Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Delete Duplicate Records based on Group

Author  Topic 

CrazyRaider
Starting Member

2 Posts

Posted - 2011-02-03 : 10:51:33
have a database that I'm updating to be more streamlined. My table currently consists of:

ID | CODE | MODEL | DEVICE | MANUFACTURER | SUCCESS | ATTEMPTS | EMP_NUM | VOTE_DATE | ADD_DATE

My modified table consists of a new column called FAMILY which groups the models into a family that share the same values. That way I can have one record per family vs one for each individual model. I'll eventually remove the MODEL Column

ID | CODE | MODEL | FAMILY | DEVICE | MANUFACTURER | SUCCESS | ATTEMPTS | EMP_NUM | VOTE_DATE | ADD_DATE

I need to delete the duplicate records that share a common group (CODE | FAMILY | DEVICE | MANUFACTURER). The catch is I want to keep the record with the highest amount of attempts.

I have this simple SELECT statement that shows me the codes I want to keep.

SELECT CODE, FAMILY, DEVICE, MANUFACTURER, MAX(ATTEMPTS)AS ATTEMPTS
FROM codes_table
GROUP BY CODE, FAMILY, DEVICE, MANUFACTURER

I need to translate this into a DELETE statement.

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-02-03 : 10:58:03
delete tbl
from tbl t
join (SELECT CODE, FAMILY, DEVICE, MANUFACTURER, MAX(ATTEMPTS)AS ATTEMPTS
FROM codes_table
GROUP BY CODE, FAMILY, DEVICE, MANUFACTURER) a
on t.CODE = a.CODE
and t.FAMILY = a.FAMILY
and t.DEVICE = a.DEVICE
and t.MANUFACTURER = a.MANUFACTURER
where t.ATTEMPTS <> a.ATTEMPTS

do you have any with the same number of attempts?

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

CrazyRaider
Starting Member

2 Posts

Posted - 2011-02-03 : 11:47:27
quote:
Originally posted by nigelrivett

delete tbl
from tbl t
join (SELECT CODE, FAMILY, DEVICE, MANUFACTURER, MAX(ATTEMPTS)AS ATTEMPTS
FROM codes_table
GROUP BY CODE, FAMILY, DEVICE, MANUFACTURER) a
on t.CODE = a.CODE
and t.FAMILY = a.FAMILY
and t.DEVICE = a.DEVICE
and t.MANUFACTURER = a.MANUFACTURER
where t.ATTEMPTS <> a.ATTEMPTS

do you have any with the same number of attempts?

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.



There are codes that do have the same number of attempts. When the data was first entered they all received an attempt of 1. It is possible that not all codes have been voted on and only have 1 attempt. 1 is the lowest possible value, and > 1000. Thats why I'm updating the table. So that the family gets the attempts not the individual model.

Today I have 53409 records. Removing the duplicates I should have 17944. When I run the statement you provided its giving me 23485 records
Go to Top of Page

Ranjit.ileni
Posting Yak Master

183 Posts

Posted - 2011-02-04 : 00:22:40
Hi,
this may useful. I am giving example scenario .

create table #t
(
id int,
CODE varchar(5),
FAMILY varchar(5),
DEVICE varchar(5),
MANUFACTURER varchar(5)
)
go
insert into #t
select 1,'a','b','c','d' union all
select 1,'a','b','c','d' union all
select 2,'aa','bb','cc','dd' union all
select 2,'aa','bb','cc','dd' union all
select 3,'aaa','bbb','cbc','ddd' union all
select 3,'aaa','bbb','cbc','ddd'
go

select * from #t --6 rows
go

-- one method for Delete duplicates
delete T from
(
select row_number() over(partition by CODE, FAMILY, DEVICE, MANUFACTURER order by id) as row_number, * from #t
) T
where row_number>1
go

select * from #t --3rows
go



--Ranjit
Go to Top of Page
   

- Advertisement -