Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
musicman
Starting Member
1 Post |
Posted - 2011-07-05 : 09:28:46
|
| I am trying to delete duplicate records from a table that contains 2 million records. The problem I am facing is that the query is taking several days to complete. My query is below, it is basically comparing each record with another record & if all columns are the same data, it deletes the record. This technique has been successful for me on tables with smaller amounts of records, but I am not sure what I can do to speed up the query. My estimation is that there should only be about 200,000 unique records in this table which would indicate approximately 1.8 million records need to be deleted.Anyone have any advise as to how I can accomplish this task quicker. Thanks in advance. delete from FOXPRO_COMPLETE.CLAIMANT_NAME_T awhere a.rowid > any (select b.rowid from FOXPRO_COMPLETE.CLAIMANT_NAME_T b where (a.CLMNT_ID = b.CLMNT_ID or (a.CLMNT_ID is null and b.CLMNT_ID is null))and (a.CLMNT_FRST_NME = b.CLMNT_FRST_NME or (a.CLMNT_FRST_NME is null and b.CLMNT_FRST_NME is null))and (a.CLMNT_LAST_NME = b.CLMNT_LAST_NME or (a.CLMNT_LAST_NME is null and b.CLMNT_LAST_NME is null))and (a.CLMNT_MI = b.CLMNT_MI or (a.CLMNT_MI is null and b.CLMNT_MI is null)) ); |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-07-05 : 09:36:38
|
| Get the unique rows firstselect CLMNT_ID, CLMNT_FRST_NME , CLMNT_LAST_NME , CLMNT_MI , rowid = min(rowid)into #afrom FOXPRO_COMPLETE.CLAIMANT_NAME_Tgroup by CLMNT_ID, CLMNT_FRST_NME , CLMNT_LAST_NME , CLMNT_MI now deletedelete FOXPRO_COMPLETE.CLAIMANT_NAME_Twhere rowid not in (select rowid from #a)You coukld do it in a single query but I prefer to see what's going to happen first.==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
Seventhnight
Master Smack Fu Yak Hacker
2878 Posts |
Posted - 2011-07-05 : 09:41:19
|
... and with a better answer I think.I would think something like this would be quicker... but it's still going to take some time to delete that many records.I assumed that CLMNT_ID was a positive integer (so -1 in place of null)And I assumed that an empty string is similar to a null for 'value' purposes...delete bfrom FOXPRO_COMPLETE.CLAIMANT_NAME_T aInner Join FOXPRO_COMPLETE.CLAIMANT_NAME_T bOn isnull(a.CLMNT_ID,-1) = isnull(b.CLMNT_ID,-1)and isnull(a.CLMNT_FRST_NME,'') = isnull(b.CLMNT_FRST_NME,'')and isnull(a.CLMNT_LAST_NME,'') = isnull(b.CLMNT_LAST_NME,'')and isnull(a.CLMNT_MI,'') = isnull(b.CLMNT_MI,'')and a.rowId < b.rowId Corey I Has Returned!! |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2011-07-05 : 09:42:21
|
Assuming you only want the minimum rowid for each entry, this is the fastest way in SQL Server to delete them:SELECT MIN(rowid) rowid, CLMNT_ID, CLMNT_FRST_NME, CLMNT_LAST_NME, CLMNT_MI INTO #tempFROM FOXPRO_COMPLETE.CLAIMANT_NAME_TGROUP BY CLMNT_ID, CLMNT_FRST_NME, CLMNT_LAST_NME, CLMNT_MI -- select rows to keep into a temp tableTRUNCATE TABLE FOXPRO_COMPLETE.CLAIMANT_NAME_T -- truncate original tableINSERT FOXPRO_COMPLETE.CLAIMANT_NAME_T SELECT * FROM #temp -- insert from the temp tableDROP TABLE #temp |
 |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-07-05 : 09:43:40
|
| I missed the goup by in my original post.Rob's suggestion only works if this is all the columns in the table.==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2011-07-05 : 09:57:09
|
If you use SQL Server 2005 and later, you can try thisDELETE fFROM ( SELECT ROW_NUMBER() OVER (PARTITION BY CLMNT_ID, CLMNT_FRST_NME, CLMNT_LAST_NME, CLMNT_MI ORDER BY RowID) AS RecID FROM FOXPRO_COMPLETE.CLAIMANT_NAME_T ) AS fWHERE RecID > 1 N 56°04'39.26"E 12°55'05.63" |
 |
|
|
|
|
|
|
|