Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
LaurieCox
158 Posts |
Posted - 2013-12-05 : 09:18:19
|
I have this table:CREATE TABLE TestData( [ClientId] [int] NULL, [RevenueCode] [varchar](16) NULL, [ServiceDate] [date] NOT NULL, [ServiceEndDate] [date] NULL,) (test data at end of post)I need to combine rows of data where the service is continuous for a given ClientId/RevenueCode and compute the length of stay (in days) for each continuous stay. Example: From this data:Row ClientId RevenueCode ServiceDate ServiceEndDate DaysDiffer---- --------- ------------ ----------- -------------- -----------1 59937 124 2013-04-09 2013-04-09 n/a2 59937 124 2013-04-10 2013-04-10 13 59937 124 2013-04-11 2013-04-11 14 59937 124 2013-05-04 2013-05-04 235 59937 124 2013-05-05 2013-05-05 16 59937 912 2012-04-24 2012-04-25 n/a7 59937 912 2012-04-25 2012-04-26 08 59937 912 2012-04-26 2012-04-27 09 59937 912 2012-04-27 2012-04-28 0I should get this result:Row ClientId RevenueCode ServiceDate ServiceEndDate LengthOfService---- --------- ------------ ----------- -------------- ---------------1 59937 124 2013-04-09 2013-04-11 32 59937 124 2013-05-04 2013-05-05 23 59937 912 2012-04-24 2012-04-28 5(Row and DaysDiffer columns added to help with explaining problem) Rows 1 thru 5 (from input data) are all under the same revenue code (124) but there is a gap of 23 days between the ServiceEndDate of row 3 and the ServiceDate of row 4, These are two different stays (one that is 2 days long and one 3 days long). RevenueCode 912 has one stay which was 5 days long. Notice that sometimes the ServiceDate will be the same as the ServiceEndDate (of the previous record) and sometimes it will be one day later (DaysDiffer) column. Both of these indicate a continuous stay. There needs to be at least 2 days between the ServiceEndDate of one row and the ServiceDate of the next row to be a break in service and thus be a separate stay.My expected results from the test data below is:Row ClientId RevenueCode ServiceDate ServiceEndDate LengthOfService---- --------- ------------ ----------- -------------- ---------------1 8 124 2013-03-09 2013-03-13 52 364 100 2012-04-27 2012-05-15 193 364 100 2013-05-21 2013-05-31 114 5919 100 2012-03-05 2012-08-23 1725 5919 100 2013-04-26 2013-07-01 676 24077 124 2013-02-22 2013-02-26 57 24077 124 2013-06-09 2013-06-12 48 24077 124 2013-07-27 2013-07-30 49 59937 124 2013-04-09 2013-04-11 310 59937 124 2013-05-04 2013-05-05 211 59937 912 2012-04-24 2012-04-28 5 ThanksTestData:CREATE TABLE #TestData( [ClientId] [int] NULL, [RevenueCode] [varchar](16) NULL, [ServiceDate] [date] NOT NULL, [ServiceEndDate] [date] NULL,)insert into #TestDataSelect 8, 124, '03/09/2013', '03/13/2013' union allSelect 364, 100, '04/27/2012', '05/15/2012' union allSelect 364, 100, '05/21/2013', '05/31/2013' union allSelect 5919, 100, '03/05/2012', '04/01/2012' union allSelect 5919, 100, '04/01/2012', '05/01/2012' union allSelect 5919, 100, '05/01/2012', '06/01/2012' union allSelect 5919, 100, '06/01/2012', '07/01/2012' union allSelect 5919, 100, '07/01/2012', '08/01/2012' union allSelect 5919, 100, '08/01/2012', '08/23/2012' union allSelect 5919, 100, '04/26/2013', '05/01/2013' union allSelect 5919, 100, '05/01/2013', '06/01/2013' union allSelect 5919, 100, '06/01/2013', '07/01/2013' union allSelect 24077, 124, '02/22/2013', '02/22/2013' union allSelect 24077, 124, '02/23/2013', '02/23/2013' union allSelect 24077, 124, '02/23/2013', '02/24/2013' union allSelect 24077, 124, '02/24/2013', '02/25/2013' union allSelect 24077, 124, '02/26/2013', '02/26/2013' union allSelect 24077, 124, '06/09/2013', '06/09/2013' union allSelect 24077, 124, '06/10/2013', '06/10/2013' union allSelect 24077, 124, '06/11/2013', '06/11/2013' union allSelect 24077, 124, '06/12/2013', '06/12/2013' union allSelect 24077, 124, '07/27/2013', '07/27/2013' union allSelect 24077, 124, '07/28/2013', '07/28/2013' union allSelect 24077, 124, '07/29/2013', '07/29/2013' union allSelect 24077, 124, '07/30/2013', '07/30/2013' union allSelect 59937, 124, '04/09/2013', '04/09/2013' union allSelect 59937, 124, '04/10/2013', '04/10/2013' union allSelect 59937, 124, '04/11/2013', '04/11/2013' union allSelect 59937, 124, '05/04/2013', '05/04/2013' union allSelect 59937, 124, '05/05/2013', '05/05/2013' union allSelect 59937, 912, '04/24/2012', '04/25/2012' union allSelect 59937, 912, '04/25/2012', '04/26/2012' union allSelect 59937, 912, '04/26/2012', '04/27/2012' union allSelect 59937, 912, '04/27/2012', '04/28/2012' |
|
Ifor
Aged Yak Warrior
700 Posts |
Posted - 2013-12-05 : 12:25:37
|
You could use a calendar table (google this) and the row difference grouping technique.I am just generating the calendar table on the fly here.WITH CalendarAS( SELECT DATEADD(d, number, CAST('20120101' AS date)) AS cDate FROM master.dbo.spt_values WHERE [type] = 'P'),DistinctDatesAS( SELECT DISTINCT T.ClientId, T.RevenueCode, C.cDate AS ServiceDate FROM #TestData T JOIN Calendar C ON C.cDate BETWEEN T.ServiceDate and T.ServiceEndDate),GrpsAS( SELECT ClientId, RevenueCode, ServiceDate ,DATEDIFF(d, '20120101', ServiceDate) - ROW_NUMBER() OVER (PARTITION BY ClientId, RevenueCode ORDER BY ServiceDate) As Grp FROM DistinctDates),RangesAS( SELECT ClientId, RevenueCode, Grp ,MIN(ServiceDate) AS ServiceDate ,MAX(ServiceDate) AS ServiceEndDate FROM Grps GROUP BY ClientId, RevenueCode, Grp)SELECT ClientId, RevenueCode, ServiceDate, ServiceEndDate ,DATEDIFF(d, ServiceDate, ServiceEndDate) + 1 AS LengthOfServiceFROM Ranges; |
|
|
LaurieCox
158 Posts |
Posted - 2013-12-05 : 13:03:16
|
Hi Ifor,Thank you very much for your help.We do have a date table and when I replaced your calendar table with it everything worked great (I will have to test against the actual data).I am going to have to carefully walk thru your code so that I understand how it works. I have been reading up about the window functions (row_number, rank, etc) and had some idea that my solution lay somewhere with them but was unsure how to proceed. Can you point me to any articles about them that can help me wrap my head around it?I assume you picked the 2012/01/01 because it was before the minimum ServiceDate in my test data.Thanks again,Laurie |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2013-12-05 : 13:14:20
|
[code]CREATE CLUSTERED INDEX IDX_Clust ON #TestData([ClientId],[RevenueCode],[ServiceDate],ServiceEndDate)DECLARE @ClientId int,@RevenueCode varchar(100),@ServiceDate date,@ServiceEndDate dateSELECT TOP 1 @ClientId = [ClientId],@RevenueCode=[RevenueCode],@ServiceDate = [ServiceDate],@ServiceEndDate=ServiceEndDateFROM #TestDataUPDATE #TestDataSET @ServiceDate=[ServiceDate] = CASE WHEN [ClientId]= @ClientId AND [RevenueCode] = @RevenueCode AND [ServiceDate] > @ServiceDate AND [ServiceDate] <= DATEADD(dd,1,@ServiceEndDate)THEN @ServiceDate ELSE [ServiceDate] END,@ServiceEndDate=[ServiceEndDate] = CASE WHEN [ClientId]= @ClientId AND [RevenueCode] = @RevenueCode AND [ServiceEndDate] > @ServiceDate AND [ServiceEndDate] <= @ServiceEndDateTHEN @ServiceEndDate ELSE [ServiceEndDate] END,@ClientId = [ClientId],@RevenueCode=[RevenueCode]OPTION (MAXDOP 1)SELECT ClientID,RevenueCode,ServiceDate,MAX(ServiceEndDate) AS ServiceEndDate,DATEDIFF(dd,ServiceDate,MAX(ServiceEndDate)) + 1 AS LengthOfServiceFROM #testDataGROUP BY ClientID,RevenueCode,ServiceDateoutput---------------------------------------------------------------ClientID RevenueCode ServiceDate ServiceEndDate lengthOfService8 124 2013-03-09 2013-03-13 5364 100 2012-04-27 2012-05-15 19364 100 2013-05-21 2013-05-31 115919 100 2012-03-05 2012-08-23 1725919 100 2013-04-26 2013-07-01 6724077 124 2013-02-22 2013-02-26 524077 124 2013-06-09 2013-06-12 424077 124 2013-07-27 2013-07-30 459937 124 2013-04-09 2013-04-11 359937 124 2013-05-04 2013-05-05 259937 912 2012-04-24 2012-04-28 5[/code]------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/https://www.facebook.com/VmBlogs |
|
|
LaurieCox
158 Posts |
Posted - 2013-12-05 : 13:33:26
|
Hi visakh16Wow, now I have two different solutions to play with.I had found this thread where you had shown this technique for a very similar problem and was trying to figure out how to modify for my needs.I need to understand both solutions so that I can use them and modify them for other problems. What are the pros and cons between the two approaches? Thanks,Laurie |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2013-12-05 : 13:53:42
|
quote: Originally posted by LaurieCox Hi visakh16Wow, now I have two different solutions to play with.I had found this thread where you had shown this technique for a very similar problem and was trying to figure out how to modify for my needs.I need to understand both solutions so that I can use them and modify them for other problems. What are the pros and cons between the two approaches? Thanks,Laurie
My method is called quirky updates. It relies upon presence of clustered on the table and also has to be run without any parallelismmore info herehttp://visakhm.blogspot.in/2010/03/using-quirky-updates-to-develop-well.html------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/https://www.facebook.com/VmBlogs |
|
|
Ifor
Aged Yak Warrior
700 Posts |
Posted - 2013-12-05 : 13:58:11
|
>> I assume you picked the 2012/01/01 because it was before the minimum ServiceDate in my test data.CorrectThe trick with this technique is to choose two orders where the difference between them is the same when they should be in the same group.The value of the grp number does not matter as long as it is different to the other groups.Eyeball:WITH CalendarAS( SELECT DATEADD(d, number, CAST('20120101' AS date)) AS cDate FROM master.dbo.spt_values WHERE [type] = 'P'),DistinctDatesAS( SELECT DISTINCT T.ClientId, T.RevenueCode, C.cDate AS ServiceDate FROM #TestData T JOIN Calendar C ON C.cDate BETWEEN T.ServiceDate and T.ServiceEndDate)SELECT ClientId, RevenueCode, ServiceDate ,DATEDIFF(d, '20090722', ServiceDate) AS Number1 ,ROW_NUMBER() OVER (PARTITION BY ClientId, RevenueCode ORDER BY ServiceDate) AS Number2 ,DATEDIFF(d, '20090722', ServiceDate) - ROW_NUMBER() OVER (PARTITION BY ClientId, RevenueCode ORDER BY ServiceDate) As GrpFROM DistinctDatesORDER BY ClientId, RevenueCode, ServiceDate; |
|
|
LaurieCox
158 Posts |
Posted - 2013-12-05 : 14:37:00
|
Ifor, visakh16,Thanks for your responses. I will be looking at both solutions and reading visakh16's reference).I really want to learn. The trick is not to just take the solution given but to work with it, understand how it works and be able to apply that to other problems.Again thanks very much for all the help,Laurie |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2013-12-06 : 04:59:07
|
quote: Originally posted by LaurieCox Ifor, visakh16,Thanks for your responses. I will be looking at both solutions and reading visakh16's reference).I really want to learn. The trick is not to just take the solution given but to work with it, understand how it works and be able to apply that to other problems.Again thanks very much for all the help,Laurie
Appreciate that attitude You'll certainly make it big on sql one day if you have that attitude------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/https://www.facebook.com/VmBlogs |
|
|
|
|
|
|
|