Author |
Topic |
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-04 : 09:45:31
|
Hi All,I would like to show unique records during any 5 minute period, where the Calltype and AgentID is the same. (Duplicates are indicated by a 1).The desired result is like below, where the rolling 5 minute time interval is considered. I'm struggling to think how I can find the necessary method to identify the duplicates.CallType---AgentID---Time---Round5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----1-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:05:00-----1-----If I try to make a duplicate key from the Calltype, AgentId and time numbers, rounding to 5 minutes, I get the wrong behaviour - as seen below when the records round to 10:05:00.CallType---AgentID---Time---Round5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----0-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:05:00-----1-----Same problem if I round up 5 the nearest mintues as seen below with the last row of data.CallType---AgentID---Time---RoundUP5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----1-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:10:00-----0-----Any help on this appreciated.Thanks |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-04 : 10:01:35
|
[code]-- Prepare sample dataDECLARE @Sample TABLE ( CallType TINYINT, AgentID SMALLINT, [Time] TIME(0) )INSERT @SampleSELECT 10, 1215, '09:01:00' UNION ALLSELECT 12, 1215, '10:01:00' UNION ALLSELECT 12, 1215, '10:02:00' UNION ALLSELECT 12, 1215, '10:02:31' UNION ALLSELECT 12, 1215, '10:02:38' UNION ALLSELECT 12, 1215, '10:05:59'-- Display the wanted resultSELECT s.CallType, s.AgentID, s.[Time], CASE WHEN DATEDIFF(SECOND, f.[Time], s.[Time]) BETWEEN 1 AND 299 THEN 1 ELSE 0 ENDFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.[Time] FROM @Sample AS w WHERE w.AgentID = s.AgentID AND w.CallType = s.CallType AND w.[Time] < s.[Time] ORDER BY w.[Time] ) AS f([Time])[/code] N 56°04'39.26"E 12°55'05.63" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-04 : 10:10:55
|
If you stop changing your original query, you actually might have a look at my suggestion. N 56°04'39.26"E 12°55'05.63" |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-04 : 12:10:58
|
thanks Peter, this looks interesting. Although I'm not finding the duplicates I need, I think because I specified time only (I actually have a datetime), the query returns the first unique record, not subsequent records in a later time period?? I've altered code below for an example I have tested.I want to use this query over a period of several months to find duplicates in any one period of 5 minutes. Record 50010 is what I want to identify as a duplicate again, does that make sense?DECLARE @Sample TABLE ( intLeadID INT, intCampaignID INT, intUserID INT, [dtmCreated] DATETIME )INSERT @SampleSELECT 47851, 5552, 93, '2010-07-08 14:33:37.013' UNION ALLSELECT 47852, 5552, 94, '2010-07-08 14:33:50.513' UNION ALLSELECT 47854, 5552, 55, '2010-07-08 14:35:20.453' UNION ALLSELECT 47855, 5552, 55, '2010-07-08 14:35:20.513' UNION ALLSELECT 47856, 5552, 55, '2010-07-08 14:36:13.437' UNION ALLSELECT 47857, 5552, 93, '2010-07-08 14:36:23.843' UNION ALLSELECT 47858, 5552, 93, '2010-07-08 14:36:41.280' UNION ALLSELECT 50009, 5552, 93, '2010-07-09 14:51:22.103' UNION ALLSELECT 50010, 5552, 93, '2010-07-09 14:54:28.513' SELECT s.intLeadID, s.intCampaignID, s.intUserID, s.dtmCreated, CASE WHEN DATEDIFF(SECOND, f.dtmCreated, s.dtmCreated) BETWEEN 0 AND 299 THEN 1 ELSE 0 ENDFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.dtmCreated FROM @Sample AS w WHERE w.intUserID = s.intUserID AND w.intCampaignID = s.intCampaignID AND w.dtmCreated < s.dtmCreated ORDER BY w.dtmCreated ) AS f(dtmCreated) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-04 : 14:20:46
|
My bad. The inner ORDER BY statement should DESCending.DECLARE @Sample TABLE ( intLeadID INT, intCampaignID INT, intUserID INT, dtmCreated DATETIME )INSERT @SampleSELECT 47851, 5552, 93, '2010-07-08 14:33:37.013' UNION ALLSELECT 47852, 5552, 94, '2010-07-08 14:33:50.513' UNION ALLSELECT 47854, 5552, 55, '2010-07-08 14:35:20.453' UNION ALLSELECT 47855, 5552, 55, '2010-07-08 14:35:20.513' UNION ALLSELECT 47856, 5552, 55, '2010-07-08 14:36:13.437' UNION ALLSELECT 47857, 5552, 93, '2010-07-08 14:36:23.843' UNION ALLSELECT 47858, 5552, 93, '2010-07-08 14:36:41.280' UNION ALLSELECT 50009, 5552, 93, '2010-07-09 14:51:22.103' UNION ALLSELECT 50010, 5552, 93, '2010-07-09 14:54:28.513' SELECT s.intLeadID, s.intCampaignID, s.intUserID, s.dtmCreated, CASE WHEN DATEDIFF(SECOND, f.dtmCreated, s.dtmCreated) BETWEEN 0 AND 299 THEN 1 ELSE 0 ENDFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.dtmCreated FROM @Sample AS w WHERE w.intUserID = s.intUserID AND w.intCampaignID = s.intCampaignID AND w.dtmCreated < s.dtmCreated ORDER BY w.dtmCreated DESC ) AS f(dtmCreated) N 56°04'39.26"E 12°55'05.63" |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-04 : 14:44:59
|
Nearly there! It's found a lot more of the duplicates I was expecting on my full table, however I can see some not found, eg for agent 45 I only have one duplicate (green). I can't see why the red records are not identified?48172 5552 45 09/07/2010 12:36:33 048173 5552 45 09/07/2010 12:37:00 048176 5552 45 09/07/2010 12:38:24 048177 5552 45 09/07/2010 12:38:37 051459 11530 45 19/07/2010 12:51:07 051464 5552 45 19/07/2010 12:55:09 051465 11530 45 19/07/2010 12:55:14 151790 5552 45 20/07/2010 07:49:18 051791 5552 45 20/07/2010 07:49:50 051792 5552 45 20/07/2010 07:50:41 051827 2380 45 20/07/2010 08:48:20 051846 2380 45 20/07/2010 09:20:01 051852 5552 45 20/07/2010 09:31:25 051861 11323 45 20/07/2010 09:35:45 051873 2380 45 20/07/2010 09:43:03 051882 10976 45 20/07/2010 09:54:38 051886 8260 45 20/07/2010 09:59:19 051909 5552 45 20/07/2010 10:15:36 051910 11419 45 20/07/2010 10:15:50 052025 5552 45 20/07/2010 12:32:21 052026 5552 45 20/07/2010 12:32:29 056966 5552 45 30/07/2010 08:19:48 056968 5552 45 30/07/2010 08:20:21 056969 5552 45 30/07/2010 08:20:57 056970 5552 45 30/07/2010 08:21:39 056971 5552 45 30/07/2010 08:22:19 056972 5552 45 30/07/2010 08:23:03 056973 5552 45 30/07/2010 08:23:40 0 |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-04 : 15:44:32
|
Look ok to me...DECLARE @Sample TABLE ( SomeUnknownID INT, CallType SMALLINT, AgentID SMALLINT, [DateTime] DATETIME, SomeUnknownFlag TINYINT )SET DATEFORMAT DMYINSERT @SampleSELECT 48172, 5552, 45, '09/07/2010 12:36:33', 0 UNION ALLSELECT 48173, 5552, 45, '09/07/2010 12:37:00', 0 UNION ALLSELECT 48176, 5552, 45, '09/07/2010 12:38:24', 0 UNION ALLSELECT 48177, 5552, 45, '09/07/2010 12:38:37', 0 UNION ALLSELECT 51459, 11530, 45, '19/07/2010 12:51:07', 0 UNION ALLSELECT 51464, 5552, 45, '19/07/2010 12:55:09', 0 UNION ALLSELECT 51465, 11530, 45, '19/07/2010 12:55:14', 1 UNION ALLSELECT 51790, 5552, 45, '20/07/2010 07:49:18', 0 UNION ALLSELECT 51791, 5552, 45, '20/07/2010 07:49:50', 0 UNION ALLSELECT 51792, 5552, 45, '20/07/2010 07:50:41', 0 UNION ALLSELECT 51827, 2380, 45, '20/07/2010 08:48:20', 0 UNION ALLSELECT 51846, 2380, 45, '20/07/2010 09:20:01', 0 UNION ALLSELECT 51852, 5552, 45, '20/07/2010 09:31:25', 0 UNION ALLSELECT 51861, 11323, 45, '20/07/2010 09:35:45', 0 UNION ALLSELECT 51873, 2380, 45, '20/07/2010 09:43:03', 0 UNION ALLSELECT 51882, 10976, 45, '20/07/2010 09:54:38', 0 UNION ALLSELECT 51886, 8260, 45, '20/07/2010 09:59:19', 0 UNION ALLSELECT 51909, 5552, 45, '20/07/2010 10:15:36', 0 UNION ALLSELECT 51910, 11419, 45, '20/07/2010 10:15:50', 0 UNION ALLSELECT 52025, 5552, 45, '20/07/2010 12:32:21', 0 UNION ALLSELECT 52026, 5552, 45, '20/07/2010 12:32:29', 0 UNION ALLSELECT 56966, 5552, 45, '30/07/2010 08:19:48', 0 UNION ALLSELECT 56968, 5552, 45, '30/07/2010 08:20:21', 0 UNION ALLSELECT 56969, 5552, 45, '30/07/2010 08:20:57', 0 UNION ALLSELECT 56970, 5552, 45, '30/07/2010 08:21:39', 0 UNION ALLSELECT 56971, 5552, 45, '30/07/2010 08:22:19', 0 UNION ALLSELECT 56972, 5552, 45, '30/07/2010 08:23:03', 0 UNION ALLSELECT 56973, 5552, 45, '30/07/2010 08:23:40', 0SELECT s.SomeUnknownID, s.CallType, s.AgentID, s.[DateTime], s.SomeUnknownFlag, CASE WHEN DATEDIFF(SECOND, f.[DateTime], s.[DateTime]) BETWEEN 1 AND 299 THEN 1 ELSE 0 END AS FlagByPesoFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.[DateTime] FROM @Sample AS w WHERE w.CallType = s.CallType AND w.AgentID = s.AgentID AND w.[DateTime] < s.[DateTime] ORDER BY w.[DateTime] DESC ) AS f([DateTime])ORDER BY s.CallType, s.AgentID, s.[DateTime] N 56°04'39.26"E 12°55'05.63" |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-04 : 16:48:31
|
Original one is working OK now.. hmmm. Maybe too a long a day for me!Thanks again Peter, this helps a lot. |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-13 : 15:09:03
|
OK - so I'm hoping to use this same theory in another problem, looking for duplicate telephone number records in a rolling 7 day period, and think I found my issue.I tried to implement the solution, but the Seconds difference calculates from the previous duplicate record, not since the first unique record in the series/period.In the below example I would expect CallID 3 to be Unique, and have a second difference of 604801 from CallID 1, but it is only 600301 having calculated difference from CallID 2.DECLARE @Sample TABLE ( CallID INT, TelephoneNo INT, [StartTime] DATETIME )INSERT @SampleSELECT 1, 901732, '2010-01-01 09:00:00.000' UNION ALL SELECT 2, 901732, '2010-01-01 09:30:00.000' UNION ALL SELECT 3, 901732, '2010-01-08 09:00:01.000' UNION ALL SELECT 4, 901732, '2010-01-19 09:00:00.000' UNION ALL SELECT 5, 123456, '2010-01-09 09:00:00.000' UNION ALL SELECT 6, 123456, '2010-02-09 09:00:00.000' SELECT s.CallID, s.TelephoneNo, s.StartTime, CASE WHEN DATEDIFF(SECOND, f.StartTime, s.StartTime) BETWEEN 0 AND 604800 THEN 0 ELSE 1 END AS UniqueFlag, DATEDIFF(SECOND, f.StartTime, s.StartTime) AS SecondDifferenceFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.StartTime FROM @Sample AS w WHERE w.TelephoneNo = s.TelephoneNo AND w.StartTime < s.StartTime ORDER BY w.StartTime DESC ) AS f(StartTime) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-13 : 15:33:33
|
If you changeORDER BY w.StartTime DESC toORDER BY w.StartTime ASC what happens then? N 56°04'39.26"E 12°55'05.63" |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-13 : 15:57:56
|
It doesn't work in a different way. I effectively want each record to be checked if the telephone number appears in last 7 days from it's Starttime. Not sure this is the way to do it.Another record added below to illustrate. CallID 7 should now be a duplicate because of CallID 6 appearing in last 7 days from it - it's not because the difference is GREATER than 604800 due to calculating from CallID 5. DECLARE @Sample TABLE ( CallID INT, TelephoneNo INT, [StartTime] DATETIME )INSERT @SampleSELECT 1, 901732, '2010-01-01 09:00:00.000' UNION ALL SELECT 2, 901732, '2010-01-01 09:30:00.000' UNION ALL SELECT 3, 901732, '2010-01-08 09:00:01.000' UNION ALL SELECT 4, 901732, '2010-01-19 09:00:00.000' UNION ALL SELECT 5, 123456, '2010-01-09 09:00:00.000' UNION ALL SELECT 6, 123456, '2010-02-09 09:00:00.000' UNION ALLSELECT 7, 123456, '2010-02-16 08:59:00.000' SELECT s.CallID, s.TelephoneNo, s.StartTime, CASE WHEN DATEDIFF(SECOND, f.StartTime, s.StartTime) BETWEEN 0 AND 604800 THEN 0 ELSE 1 END AS UniqueFlag, DATEDIFF(SECOND, f.StartTime, s.StartTime) AS SecondDifferenceFROM @Sample AS sOUTER APPLY ( SELECT TOP(1) w.StartTime FROM @Sample AS w WHERE w.TelephoneNo = s.TelephoneNo AND w.StartTime < s.StartTime ORDER BY w.StartTime ASC ) AS f(StartTime) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-13 : 17:00:49
|
[code]DECLARE @Sample TABLE ( CallID INT, TelephoneNo INT, [StartTime] DATETIME )INSERT @SampleSELECT 1, 901732, '2010-01-01 09:00:00.000' UNION ALL SELECT 2, 901732, '2010-01-01 09:30:00.000' UNION ALL SELECT 3, 901732, '2010-01-08 09:00:01.000' UNION ALL SELECT 4, 901732, '2010-01-19 09:00:00.000' UNION ALL SELECT 5, 123456, '2010-01-09 09:00:00.000' UNION ALL SELECT 6, 123456, '2010-02-09 09:00:00.000' UNION ALLSELECT 7, 123456, '2010-02-16 08:59:00.000' SELECT f.CallID, f.TelephoneNo, f.StartTimeFROM ( SELECT CallID, TelephoneNo, StartTime, ROW_NUMBER() OVER (PARTITION BY TelephoneNo ORDER BY StartTime) AS RecID FROM @Sample ) AS sCROSS APPLY ( SELECT TOP(1) w.CallID, w.TelephoneNo, w.StartTime FROM @Sample AS w WHERE w.TelephoneNo = s.TelephoneNo AND w.StartTime >= DATEADD(SECOND, 604800, s.StartTime) ORDER BY w.StartTime ) AS f(CallID, TelephoneNo, StartTime)WHERE s.RecID = 1ORDER BY f.CallID[/code] N 56°04'39.26"E 12°55'05.63" |
 |
|
Humate
Posting Yak Master
101 Posts |
Posted - 2010-08-13 : 17:40:06
|
If it works in date order, I would expect the following results - last column idicates the unique calls in 7 day period with a value of 1.DECLARE @Sample TABLE ( CallID INT, TelephoneNo INT, [StartTime] DATETIME, IsAUniqueCall INT )INSERT @SampleSELECT 1, 901732, '2010-01-01 09:00:00.000', 1 UNION ALL SELECT 2, 901732, '2010-01-01 09:30:00.000', 0 UNION ALL SELECT 3, 901732, '2010-01-08 09:00:01.000', 0 UNION ALL SELECT 4, 901732, '2010-01-19 09:00:00.000', 1 UNION ALL SELECT 5, 123456, '2010-01-09 09:00:00.000', 1 UNION ALL SELECT 6, 123456, '2010-02-09 09:00:00.000', 1 UNION ALLSELECT 7, 123456, '2010-02-16 08:59:00.000', 0 SELECT * FROM @Sample |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-08-13 : 17:56:02
|
[code]DECLARE @Sample TABLE ( CallID INT, TelephoneNo INT, [StartTime] DATETIME )INSERT @SampleSELECT 1, 901732, '2010-01-01 09:00:00.000' UNION ALL SELECT 2, 901732, '2010-01-01 09:30:00.000' UNION ALL SELECT 3, 901732, '2010-01-08 09:00:01.000' UNION ALL SELECT 4, 901732, '2010-01-19 09:00:00.000' UNION ALL SELECT 5, 123456, '2010-01-09 09:00:00.000' UNION ALL SELECT 6, 123456, '2010-02-09 09:00:00.000' UNION ALLSELECT 7, 123456, '2010-02-16 08:59:00.000' ;WITH cteSource(CallID, TelephoneNo, StartTime, RecID)AS ( SELECT CallID, TelephoneNo, StartTime, ROW_NUMBER() OVER (PARTITION BY TelephoneNo ORDER BY StartTime) AS RecID FROM @Sample)SELECT f.CallID, f.TelephoneNo, f.StartTimeFROM cteSource AS sCROSS APPLY ( SELECT TOP(1) w.CallID, w.TelephoneNo, w.StartTime FROM @Sample AS w WHERE w.TelephoneNo = s.TelephoneNo AND w.StartTime >= DATEADD(SECOND, 604800, s.StartTime) ORDER BY w.StartTime ) AS f(CallID, TelephoneNo, StartTime)WHERE s.RecID = 1UNION ALLSELECT CallID, TelephoneNo, StartTimeFROM cteSourceWHERE RecID = 1ORDER BY CallID[/code] N 56°04'39.26"E 12°55'05.63" |
 |
|
Gilchrist
Starting Member
1 Post |
Posted - 2010-11-10 : 13:41:38
|
quote: Originally posted by Humate Hi All,I would like to show unique records during any 5 minute period, where the Calltype and AgentID is the same. (find duplicate files are indicated by a 1).The desired result is like below, where the rolling 5 minute time interval is considered. I'm struggling to think how I can find the necessary method to identify the duplicates.CallType---AgentID---Time---Round5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----1-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:05:00-----1-----If I try to make a duplicate key from the Calltype, AgentId and time numbers, rounding to 5 minutes, I get the wrong behaviour - as seen below when the records round to 10:05:00.CallType---AgentID---Time---Round5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----0-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:05:00-----1-----Same problem if I round up 5 the nearest mintues as seen below with the last row of data.CallType---AgentID---Time---RoundUP5Min---Duplicate---10000-----1215-----09:01:00-----09:00:00-----0-----12000-----1215-----10:01:00-----10:00:00-----0-----12000-----1215-----10:02:00-----10:00:00-----1-----12000-----1215-----10:02:31-----10:05:00-----1-----12000-----1215-----10:02:38-----10:05:00-----1-----12000-----1215-----10:05:59-----10:10:00-----0-----Any help on this appreciated.Thanks
Thank you i was in the same problem and thank god i found this forum. |
 |
|
|
|
|