Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 HISTORIC SNAPSHOTS OF DATA - COUNT (*) - GROUP BY

Author  Topic 

sqlnovice1
Starting Member

34 Posts

Posted - 2011-03-27 : 12:35:09
Hi

Having recently set up a Data Warehouse at work, we have been asked to come up with a solution to the following. I was wondering if anyone counld advise on the best way to achieve this.

We need to be able to count how many patients were in the hospital at any given time in the past.

Each patient is given a an admissions and discharge date. if a patient does not have a discharge date then they are considered to still be in the hospital.

I need to be able to enter a start and end date for any given period in the past (upto and including today) and have a count of patients who were admitted and discharged in that period, by dates within that period.

Below is some sample data followed by expected results. Thanks in advance for any advice.

SAMPLE DATA


drop table PS_TestForOnline



CREATE TABLE PS_TestForOnline
(
rowkey int,
crn int,

admdate date ,
disdate date,

)

INSERT INTO PS_TestForOnline
VALUES('1','11111','2011/01/01','2011/01/07' );
INSERT INTO PS_TestForOnline
VALUES('2','22222','2011/01/02','2011/01/07');
INSERT INTO PS_TestForOnline
VALUES('3','33333','2011/01/03','2011/01/07');
INSERT INTO PS_TestForOnline
VALUES('4','44444','2011/01/04','2011/01/07');
INSERT INTO PS_TestForOnline
VALUES('5','55555','2011/01/05','2011/01/07' );
INSERT INTO PS_TestForOnline
VALUES('6','66666','2011/01/06','2011/01/07' );
INSERT INTO PS_TestForOnline
VALUES('7','77777','2011/01/07','2011/01/07');
INSERT INTO PS_TestForOnline
VALUES('8','88888','2011/01/08', (NULL));


select * from PS_TestForOnline


EXPECTED RESULTS - for period 1st to 8th Jan

CREATE TABLE PS_TestForOnline_Answer
(
DATE DATE ,
COUNT_OF_PATIENTS INT,

);

INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/01','1' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/02','2' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/03','3' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/04','4' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/05','5' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/06','6' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/07','7' );
INSERT INTO PS_TestForOnline_Answer
VALUES('2011/01/08','1' );


select * from PS_TestForOnline_Answer

Paul

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-03-27 : 14:28:13
Paul if you need to get this information for a given date, it seems to me that all you have to do is to count the rows where admit date is earlier than the date and discharge date is later. In code, that would be
declare @date datetime;
set @date = '20110105';

select
count(*) as count_of_patients
from
PS_TestForOnline
where
admdate <= @date
and (disdate >= @date or disdate is null);

If you need to do this for a range of dates, you would need a calendar of some sort. If you have a table of numbers in your database, you can use that. If not, you can construct one on the fly - as in the code below:
declare @startDate datetime, @endDate datetime;
set @startDate = '20101230';
set @endDate = '20110112';

with dates as -- this part is just generating a calendar.
(
select @startDate as [date]
union all
select dateadd(dd,1,[date]) from dates where [date] <= @endDate
)
select
[date],
p.count_of_patients
from
dates d
cross apply
(
select
count(*) as count_of_patients
from
PS_TestForOnline
where
admdate <= d.date
and (disdate >= d.date or disdate is null)
) p
Go to Top of Page

Sachin.Nand

2937 Posts

Posted - 2011-03-27 : 14:47:38
Just a simple partition by clause will suffice the requirement.



select * from
(
select *,ROW_NUMBER()over(partition by disdate order by rowkey)Count_Of_Patients from PS_TestForOnline
)T where disdate='yourdate'



PBUH

Go to Top of Page

sqlnovice1
Starting Member

34 Posts

Posted - 2011-03-27 : 15:41:44
Many Thanks for your speedy reply. This seems to give me the results i was expecting. It was the second option that i needed.

Thanks again P.
Go to Top of Page

sqlnovice1
Starting Member

34 Posts

Posted - 2011-03-28 : 04:52:43
The solution provided by Sunitabeck works fine.

Although when running against the live data in the warehouse it takes a long, long time. I am running it against a table of about 800,000 admissions.

I does give me correct answer. i have indexed the date fields but it still takes a good 2 mins to do the last 2 months of activity.

Does anyone have any ideas how to enhance performance??

Thanks
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-03-28 : 06:54:47
Can you post the DDL for the table? Brett's post here http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx has instructions on how to get the DDL including indexes.

Also, even though I don't think this is the major contributor to performance, you should get rid of that CTE:

create table #Dates (date datetime not null primary key clustered);
with dates as -- this part is just generating a calendar.
(
select @startDate as [date]
union all
select dateadd(dd,1,[date]) from dates where [date] <= @endDate
)insert into #Dates select * from Dates;

and then use the #Dates table in the rest of the query.
Go to Top of Page

sqlnovice1
Starting Member

34 Posts

Posted - 2011-03-28 : 07:31:09
I added in the #temp_table and that has reduced the execution time down to 26 seconds for 2 months worth of data.

the instructions from Bretts link are for 2005, i am working in 2008, are they different in this version or am i just looking in the wrong place?

I will try and get you the DDl soon if you think it will enhance the query further.

Thanks again for your help.
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-03-28 : 12:21:45
Right click on the database in object explorer and then Tasks -> Generate Scripts. Select the database, and in script options for Table/View options, select Indexes, primary keys, unique keys etc. if they are not already selected and follow the wizard.
Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2011-03-28 : 12:52:48
I can offer one piece of advice on the dates.

Using a NULL for disdate to mean "not checked out yet" is not a good practice for a warehouse. Additionally, you are going to suffer a performance penalty. Which I think you are already discovering. You should, instead, be using some terminus date, like 9999-12-31 or whatever makes sense in your environment.
Go to Top of Page

sqlnovice1
Starting Member

34 Posts

Posted - 2011-03-29 : 06:25:51
appologies Sunitabeck. I apparently do not have access rights to perform that action (DDL). The Data Warehouse was built by an external compnay who at the moment still have administrator control. We are due training soon and i will be given the correct access after that. I will come back to you when i have access.

Once a agin thank you for all your help so far. P
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-03-29 : 07:24:01
If you have already added the appropriate indexes, there may not be much I can tell you that would make it perform faster. Then again, my skill in making queries efficient leaves a lot to be desired, so someone else on this forum might have better insights if they see the indexes.

I would take to heart Lamprey's advice about the null values if at all possible. That is likely to give you a boost in query performance.
Go to Top of Page
   

- Advertisement -