Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
sqlnovice1
Starting Member
34 Posts |
Posted - 2011-03-27 : 12:35:09
|
| Hi Having recently set up a Data Warehouse at work, we have been asked to come up with a solution to the following. I was wondering if anyone counld advise on the best way to achieve this.We need to be able to count how many patients were in the hospital at any given time in the past. Each patient is given a an admissions and discharge date. if a patient does not have a discharge date then they are considered to still be in the hospital.I need to be able to enter a start and end date for any given period in the past (upto and including today) and have a count of patients who were admitted and discharged in that period, by dates within that period. Below is some sample data followed by expected results. Thanks in advance for any advice.SAMPLE DATA drop table PS_TestForOnlineCREATE TABLE PS_TestForOnline( rowkey int, crn int, admdate date , disdate date, )INSERT INTO PS_TestForOnlineVALUES('1','11111','2011/01/01','2011/01/07' );INSERT INTO PS_TestForOnlineVALUES('2','22222','2011/01/02','2011/01/07');INSERT INTO PS_TestForOnlineVALUES('3','33333','2011/01/03','2011/01/07');INSERT INTO PS_TestForOnlineVALUES('4','44444','2011/01/04','2011/01/07');INSERT INTO PS_TestForOnlineVALUES('5','55555','2011/01/05','2011/01/07' );INSERT INTO PS_TestForOnlineVALUES('6','66666','2011/01/06','2011/01/07' );INSERT INTO PS_TestForOnlineVALUES('7','77777','2011/01/07','2011/01/07');INSERT INTO PS_TestForOnlineVALUES('8','88888','2011/01/08', (NULL));select * from PS_TestForOnlineEXPECTED RESULTS - for period 1st to 8th JanCREATE TABLE PS_TestForOnline_Answer( DATE DATE , COUNT_OF_PATIENTS INT, );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/01','1' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/02','2' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/03','3' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/04','4' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/05','5' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/06','6' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/07','7' );INSERT INTO PS_TestForOnline_AnswerVALUES('2011/01/08','1' );select * from PS_TestForOnline_AnswerPaul |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-27 : 14:28:13
|
Paul if you need to get this information for a given date, it seems to me that all you have to do is to count the rows where admit date is earlier than the date and discharge date is later. In code, that would bedeclare @date datetime;set @date = '20110105';select count(*) as count_of_patientsfrom PS_TestForOnlinewhere admdate <= @date and (disdate >= @date or disdate is null); If you need to do this for a range of dates, you would need a calendar of some sort. If you have a table of numbers in your database, you can use that. If not, you can construct one on the fly - as in the code below:declare @startDate datetime, @endDate datetime;set @startDate = '20101230';set @endDate = '20110112';with dates as -- this part is just generating a calendar.( select @startDate as [date] union all select dateadd(dd,1,[date]) from dates where [date] <= @endDate)select [date], p.count_of_patientsfrom dates d cross apply ( select count(*) as count_of_patients from PS_TestForOnline where admdate <= d.date and (disdate >= d.date or disdate is null) ) p |
 |
|
|
Sachin.Nand
2937 Posts |
Posted - 2011-03-27 : 14:47:38
|
Just a simple partition by clause will suffice the requirement.select * from(select *,ROW_NUMBER()over(partition by disdate order by rowkey)Count_Of_Patients from PS_TestForOnline)T where disdate='yourdate' PBUH |
 |
|
|
sqlnovice1
Starting Member
34 Posts |
Posted - 2011-03-27 : 15:41:44
|
| Many Thanks for your speedy reply. This seems to give me the results i was expecting. It was the second option that i needed.Thanks again P. |
 |
|
|
sqlnovice1
Starting Member
34 Posts |
Posted - 2011-03-28 : 04:52:43
|
| The solution provided by Sunitabeck works fine.Although when running against the live data in the warehouse it takes a long, long time. I am running it against a table of about 800,000 admissions.I does give me correct answer. i have indexed the date fields but it still takes a good 2 mins to do the last 2 months of activity.Does anyone have any ideas how to enhance performance??Thanks |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-28 : 06:54:47
|
Can you post the DDL for the table? Brett's post here http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx has instructions on how to get the DDL including indexes.Also, even though I don't think this is the major contributor to performance, you should get rid of that CTE:create table #Dates (date datetime not null primary key clustered);with dates as -- this part is just generating a calendar.( select @startDate as [date] union all select dateadd(dd,1,[date]) from dates where [date] <= @endDate)insert into #Dates select * from Dates; and then use the #Dates table in the rest of the query. |
 |
|
|
sqlnovice1
Starting Member
34 Posts |
Posted - 2011-03-28 : 07:31:09
|
| I added in the #temp_table and that has reduced the execution time down to 26 seconds for 2 months worth of data.the instructions from Bretts link are for 2005, i am working in 2008, are they different in this version or am i just looking in the wrong place?I will try and get you the DDl soon if you think it will enhance the query further.Thanks again for your help. |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-28 : 12:21:45
|
| Right click on the database in object explorer and then Tasks -> Generate Scripts. Select the database, and in script options for Table/View options, select Indexes, primary keys, unique keys etc. if they are not already selected and follow the wizard. |
 |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2011-03-28 : 12:52:48
|
| I can offer one piece of advice on the dates.Using a NULL for disdate to mean "not checked out yet" is not a good practice for a warehouse. Additionally, you are going to suffer a performance penalty. Which I think you are already discovering. You should, instead, be using some terminus date, like 9999-12-31 or whatever makes sense in your environment. |
 |
|
|
sqlnovice1
Starting Member
34 Posts |
Posted - 2011-03-29 : 06:25:51
|
| appologies Sunitabeck. I apparently do not have access rights to perform that action (DDL). The Data Warehouse was built by an external compnay who at the moment still have administrator control. We are due training soon and i will be given the correct access after that. I will come back to you when i have access. Once a agin thank you for all your help so far. P |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2011-03-29 : 07:24:01
|
| If you have already added the appropriate indexes, there may not be much I can tell you that would make it perform faster. Then again, my skill in making queries efficient leaves a lot to be desired, so someone else on this forum might have better insights if they see the indexes.I would take to heart Lamprey's advice about the null values if at all possible. That is likely to give you a boost in query performance. |
 |
|
|
|
|
|
|
|