| Author |
Topic |
|
jbasso
Starting Member
2 Posts |
Posted - 2012-06-21 : 18:26:00
|
| I am trying to clean up a table that has 500,000 entries. The table consists of data saved by an import program written in C++, which I have no control over.

The problem is that the program creates near-duplicate entries whenever the name or address field is not exactly like an existing entry.

Sample:

ID     SubID  Name               Address
12345  X001   Green, Robert Jr   123 N. First Street
12345  X002   Green, Robert Jr.  123 N. First Street
12345  X003   Green, Robert Jr.  123 N 1st Street
12345  X004   Green, Robert      456 S. 3rd Street
12345  X005   Green, Janice      456 S. 3rd Street

It has to do with how the data is entered in the imported file, which again I have no control over.

I am not sure whether I posted in the wrong area or whether nobody has been able to answer this yet, so I am re-posting it here.

What is my best option for reporting on or cleaning up the table so that a query returns only the truly unique records, such as:

ID     SubID  Name               Address
12345  X001   Green, Robert Jr   123 N. First Street
12345  X004   Green, Robert      456 S. 3rd Street
12345  X005   Green, Janice      456 S. 3rd Street

Any one of the 3 matching entries in the sample would be acceptable to return.

Thanks in advance for any help. |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-06-21 : 19:49:35
|
| You might have to define rules for identifying similar entries, then use those rules to return only the unique records.

------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/ |
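To make the idea of "rules" concrete, here is a minimal sketch in Python, run against the sample rows from the question. The rules themselves are assumptions chosen for illustration (ignore case and punctuation, collapse spaces, treat "First" and "1st" as the same word), and the names `ORDINALS`, `normalize`, and `unique_rows` are hypothetical. Real data would almost certainly need a longer rule list, and the same normalization could instead be expressed in the database.

```python
import re

# Hypothetical rule table: spelled-out ordinals mapped to their numeric form.
# Extend it as further variants turn up in the real data.
ORDINALS = {"first": "1st", "second": "2nd", "third": "3rd"}


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace, unify ordinals."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    words = [ORDINALS.get(word, word) for word in text.split()]
    return " ".join(words)


def unique_rows(rows):
    """Keep the first row seen for each normalized (ID, name, address) key."""
    seen = {}
    for row in rows:
        rec_id, _sub_id, name, address = row
        key = (rec_id, normalize(name), normalize(address))
        seen.setdefault(key, row)  # later near-duplicates are ignored
    return list(seen.values())


# Sample rows copied from the question.
rows = [
    ("12345", "X001", "Green, Robert Jr", "123 N. First Street"),
    ("12345", "X002", "Green, Robert Jr.", "123 N. First Street"),
    ("12345", "X003", "Green, Robert Jr.", "123 N 1st Street"),
    ("12345", "X004", "Green, Robert", "456 S. 3rd Street"),
    ("12345", "X005", "Green, Janice", "456 S. 3rd Street"),
]

result = unique_rows(rows)
for row in result:
    print(row)
```

With these rules the sample collapses to the X001, X004, and X005 rows. Which row survives in each group is simply the first one encountered, which fits the statement that any one of the matching entries is acceptable.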
 |