Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 How to remove duplicate rows

Author  Topic 

cnbhold
Starting Member

43 Posts

Posted - 2011-08-01 : 14:21:46
I have a table that keeps track of a user’s PC Number (MachineID), software version (VersionID) and their user identification (UserID). The problem I’ve noticed is that a user has an entry in the table for the same Software version (VersionID) multiple times. I need a way to delete the rows that are duplicates.

TableName:

Software

Columns:

ID INT Primary Key,
MachineID INT,
VersionID INT,
UserID INT


Example of duplicates: The first three rows are duplicates

ID......MachineID....VersionID......UserID
1.......13467..........202036.........26709
2.......13467..........202036.........26709
3.......13467..........202036.........26709
5.......13467..........569877.........26709
5.......13467..........263298.........26709
6.......13467..........745691.........26709

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-08-01 : 14:53:50
[code]
Declare @t table (
id int identity(1,1),
mId int,
vId int,
uId int,
Primary Key(id)
)

Insert Into @t Select 13467, 202036, 26709
Insert Into @t Select 13467, 202036, 26709
Insert Into @t Select 13467, 202036, 26709
Insert Into @t Select 13467, 569877, 26709
Insert Into @t Select 13467, 263298, 26709
Insert Into @t Select 13467, 263298, 26709
Insert Into @t Select 13467, 745691, 26709


Select * From @t

Delete A
From @t A
Inner Join @t B
On A.mId = B.mId
and A.vId = B.vId
and A.uId = B.uId
and A.id > B.id

Select * From @t
[/code]

Corey

I Has Returned!!
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-08-01 : 16:02:57
Every SQL forum or newsgroup expects that you will post DDL, sample data and clear specifications which might might include output. This is explained in the FAQ section. We can not read minds; what would we need to know to do your job for you?

Tables must have keys and should have DRI, constraints, and all the basic features of a schema. You should know use ISO-8601 Standards for temporal data, avoid needless dialect, basic data modeling and use ISO-11179 Standards for data element names. Please tell us if you can change the DDL.

Why do you have that magical "id" and no key? Do you really have PCs with integers as their serial numbers? Do you know what a PRIMARY KEY?

>> I need a way to delete the rows that are duplicates <<

No, you need a way to prevent garbage in the first place. Here is one way, assuming a 1:1:1 relationship.

CREATE TABLE Software_Assignments
(pc_serial_nbr VARCHAR(20) NOT NULL,
software_version_nbr INTEGER NOT NULL,
user_id INTEGER NOT NULL,
PRIMARY KEY (pc_serial_nbr, software_version_nbr, user_id));

What the rules about PC assignments in your company? Or homework problem.



--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-02 : 00:56:16
[code]DELETE t
FROM
(SELECT ROW_NUMBER() OVER(PARTITION BY MachineID,VersionID,UserID ORDER BY ID) AS rn
FROM Table
)t
WHERE rn>1
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

cnbhold
Starting Member

43 Posts

Posted - 2011-08-05 : 09:54:36
Thanks for all the suggestions. I came across this solution that worked really well since this table had a Primary Key field.

--See How Many Duplicate Exists
SELECT MachineID, VersionID, UserID, COUNT(*) TotalCount
FROM DM_Emails
GROUP BY MachineID, VersionID, UserID
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

DELETE
FROM DM_Emails
WHERE EmailID NOT IN
(
SELECT MAX(EmailID)
FROM DM_Emails
GROUP BY MachineID, VersionID, UserID)

Go to Top of Page
   

- Advertisement -