 Create file of unique records

tmccar
Starting Member

27 Posts

Posted - 2012-07-30 : 05:16:58
I have an Excel file of customer records, with the usual fields for address, contact name etc. There are a lot of duplicate customer name records, which may have other relevant fields filled in. I would like to create a new file with unique customer records. This should be based on the first occurrence of a customer, and all the duplicate records should be merged with this.
How can I achieve this?

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2012-07-30 : 05:47:03
I would import then export.
How is a customer identified? If you have a customer no and a datestamp, something like this:
;with cte as
(select *, seq = row_number() over (partition by CustomerNo order by dte desc) from tbl)
select CustomerNo, Name, Address
from cte
where seq = 1



==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

tmccar
Starting Member

27 Posts

Posted - 2012-07-30 : 06:13:59
Thanks Nigel - the problem has been "condensed" into a slightly easier one, where I need just to identify duplicate customer ids. My master table looks like this:

ID -- Name
1275 -- Customer A
3472 -- Customer A
2812 -- Customer A
1245 -- Customer B
1544 -- Customer C
2567 -- Customer D
3446 -- Customer D

So, I have 3 duplicates for Customer A and 2 for Customer D. I want to take the first as the "master". The output table should look like this:

Master ID -- Slave ID
1275 -- 3472
1275 -- 2812
2567 -- 3446

And so on. (Each duplicate will cause a record to be written to the output file).
I'm using Excel at the moment, but I could switch to SQL. What is the easiest way to do this?

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2012-07-30 : 07:03:00
;with cte as
(select *, seq = row_number() over (partition by Name order by ID) from tbl)
select ID, Name from cte where seq = 1

tmccar
Starting Member

27 Posts

Posted - 2012-07-30 : 07:48:33
I can't get the commands to work for me. I am using SQLite. Should these commands work there, or would I need something like SQL Server 2008 R2 Express?

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2012-07-30 : 08:55:07
Didn't see your last bit.
row_number() and CTEs are a bit sql server specific - you get something similar on most databases but often a different syntax.
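For reference, SQLite has supported CTEs since version 3.8.3 and window functions since 3.25, so on a recent build the windowed query can be tried in standard syntax (`AS seq` rather than `seq =`). A minimal sketch through Python's built-in sqlite3 module. The in-memory database, the three sample rows and the version guard are assumptions of this sketch, not part of the posts above:

```python
import sqlite3

# Window functions need SQLite 3.25 or later; skip the query on older builds.
HAS_WINDOW_FUNCTIONS = sqlite3.sqlite_version_info >= (3, 25, 0)

conn = sqlite3.connect(":memory:")  # throwaway database (assumption)
conn.execute("CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [
    (1275, "Customer A"), (3472, "Customer A"), (1245, "Customer B"),
])

# Same idea as the CTE above, with the alias written as "AS seq".
query = """
WITH cte AS (
    SELECT ID, Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS seq
    FROM tbl
)
SELECT ID, Name FROM cte WHERE seq = 1 ORDER BY ID
"""
masters = conn.execute(query).fetchall() if HAS_WINDOW_FUNCTIONS else None
print(masters)
```

On an older SQLite library the guard leaves `masters` as None, and a plain join on min(ID) is the safer route.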

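Try this (aliases written with "as" so it runs outside SQL Server, and the master row itself filtered out)
select t2.ID as MasterID, t1.ID as SlaveID
from tbl t1
join (select Name, min(ID) as ID from tbl group by Name) t2
on t1.Name = t2.Name
where t1.ID <> t2.ID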
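
A minimal sketch for trying the min(ID) join in SQLite through Python's built-in sqlite3 module, using the sample rows posted earlier in the thread. The table name tbl comes from the thread. The in-memory database, the `AS` aliases and the `t1.ID <> t2.ID` filter (which drops each master paired with itself) are assumptions of this sketch:

```python
import sqlite3

# Sample rows copied from the earlier post.
rows = [
    (1275, "Customer A"),
    (3472, "Customer A"),
    (2812, "Customer A"),
    (1245, "Customer B"),
    (1544, "Customer C"),
    (2567, "Customer D"),
    (3446, "Customer D"),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute("CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO tbl (ID, Name) VALUES (?, ?)", rows)

# t2 holds the lowest ID per name (the "master"); t1 supplies every row
# with that name, and the WHERE clause removes the master itself.
query = """
SELECT t2.ID AS MasterID, t1.ID AS SlaveID
FROM tbl t1
JOIN (SELECT Name, MIN(ID) AS ID FROM tbl GROUP BY Name) t2
  ON t1.Name = t2.Name
WHERE t1.ID <> t2.ID
ORDER BY MasterID, SlaveID
"""
pairs = conn.execute(query).fetchall()
for master_id, slave_id in pairs:
    print(master_id, "--", slave_id)
```

Note that the master here is the lowest ID per name. That matches the first-listed row in the sample data only because the first row of each customer happens to have the smallest ID.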


tmccar
Starting Member

27 Posts

Posted - 2012-07-30 : 10:11:18
Hi Nigel
Please clarify:
According to your code, should my 2 tables be named t1 and t2?
And is tbl a variable, pointing to table t1?

Thanks

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2012-07-30 : 11:12:45
tbl is the name of your table - t1 and t2 are aliases used in the query.
