Author |
Topic |
anjali5
Posting Yak Master
121 Posts |
Posted - 2014-11-25 : 13:46:21
|
Hi All,I want to filter out the duplicate rows based on three columns. I got this quick query from Microsoft site to filter out the duplicate rows, but I am getting the result that filters out the non-duplicate one too. Below is the query;With Temp as ( SELECT row_number() over (partition by [id],[p_date], order by [id],[p_date],[code]) as Row, [id] ,[p_date] ,[code]from test) select * from Temp where Row = 1 I am getting the following results[code]1 NULL 2014-09-12 04:53:00 A1 NULL 2014-09-12 05:26:00 P 1 NULL 2014-09-12 05:31:00 P In the above case id is null, but in some rows id is not null . The above is obviously not duplicate. I am not sure what am I doing wrong.any help will be appreciated. |
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2014-11-25 : 14:12:06
|
I don't understand your post. It seems to contradict itself. Please show us sample data with the duplicates, and then using that same sample data show us what your query should output.Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/ |
|
|
gbritton
Master Smack Fu Yak Hacker
2780 Posts |
Posted - 2014-11-25 : 14:12:07
|
The query (which filters by two columns, not three) returns the unique rows, not the duplicates. Your results are the unique rows by id and p_date. It goes without saying that the rest of the rows in the table are duplicates of one of these three. |
|
|
anjali5
Posting Yak Master
121 Posts |
Posted - 2014-11-25 : 14:27:57
|
below is the sample data1 2014-09-03 04:48:00 NULL2 2014-09-03 04:49:00 NULL3 2014-09-03 04:50:00 NULL4 2014-09-03 04:51:00 NULL5 2014-09-03 04:52:00 NULLNULL 2014-09-03 04:47:00 1234NULL 2014-09-03 04:47:00 1234NULL 2014-09-03 04:47:00 1234NULL 2014-09-03 04:48:00 M I only want these rows from above dataNULL 2014-09-03 04:47:00 1234NULL 2014-09-03 04:47:00 1234NULL 2014-09-03 04:47:00 1234 below is the insert statementINSERT INTO [SportsStore].[dbo].[Test] ([ID] ,[p_date] ,[Code]) VALUES (NULL ,'2014-09-03 04:48:00' ,'M')SET ANSI_PADDING ONGOCREATE TABLE [dbo].[Test]( [ID] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, [p_date] [smalldatetime] NULL, [Code] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL) ON [PRIMARY]GOSET ANSI_PADDING OFF |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2014-11-25 : 14:32:56
|
Since they are duplicates, do you need all 3 returned? If you just need 1, here you go:select ID, p_date, Code, count(*)from Testgroup by ID, p_date, Codehaving count(*) > 1Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/ |
|
|
anjali5
Posting Yak Master
121 Posts |
Posted - 2014-11-25 : 14:34:27
|
quote: Originally posted by tkizer Since they are duplicates, do you need all 3 returned? If you just need 1, here you go:select ID, p_date, Code, count(*)from Testgroup by ID, p_date, Codehaving count(*) > 1Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/
I need all three to be returned. |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2014-11-25 : 14:57:33
|
[code];With Temp as ( select ID, p_date, Code from Test group by ID, p_date, Code having count(*) > 1) select Temp.*from Tempjoin Test on COALESCE(Test.ID, '') = COALESCE(Temp.ID, '') and Temp.p_date = Test.p_date and Temp.Code = Test.Code[/code]Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/ |
|
|
anjali5
Posting Yak Master
121 Posts |
Posted - 2014-11-26 : 01:28:19
|
quote: Originally posted by tkizer
;With Temp as ( select ID, p_date, Code from Test group by ID, p_date, Code having count(*) > 1) select Temp.*from Tempjoin Test on COALESCE(Test.ID, '') = COALESCE(Temp.ID, '') and Temp.p_date = Test.p_date and Temp.Code = Test.Code Tara KizerSQL Server MVP since 2007http://weblogs.sqlteam.com/tarad/
Thanks. |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|