Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Merge Multiple records into one master record

Author  Topic 

cdex3
Starting Member

3 Posts

Posted - 2011-09-15 : 11:37:21
I have to merge multiple records into one master record. Some of the records have null values. I have deduped these records using the FuzzyGrouping in SSIS so each similar record has the same "CustFuzzyGroup_key_out" on which they should be grouped by. Looking for the best approach using SQL Server 2008 R2.

Thanks in advance.

The scripts are below.


--Table Schema
CREATE TABLE [#CustomerInfo](
[CustRowID] [int] NULL,
[CustFuzzyGroup_Key_Out_ID] [int] NULL,
[CustLocID] [int] NULL,
[CustNo] [varchar](17) NULL,
[FirstName] [varchar](25) NULL,
[MiddleName] [varchar](25) NULL,
[LastName] [varchar](40) NULL,
[AddressLine1] [varchar](45) NULL,
[City] [varchar](35) NULL,
[State] [varchar](3) NULL,
[ZipCode] [varchar](10) NULL,
[CellPhone] [varchar](10) NULL,
[BusinessPhone] [varchar](15) NULL,
[EmailAddress] [varchar](40) NULL,
[CreateDate] [datetime] NULL
) ON [PRIMARY]

GO

-- Test Data
INSERT INTO #CustomerInfo
( CustRowID
,CustFuzzyGroup_Key_Out_ID
,CustLocID
,CustNo
,FirstName
,MiddleName
,LastName
,AddressLine1
,City
,State
,ZipCode
,CellPhone
,BusinessPhone
,EmailAddress
,CreateDate
)
VALUES
( 25, 3, 204, '127489', 'Tommy', 'A', 'James', '123 Any Street', 'New York', 'NY', '10011', '2121231234', '', '', '2011-08-09 16:23:45' ),
( 18, 3, 255, '36987', 'Tommy', '', 'James', '123 Any Street', 'New York', 'NY', '10011', '', '2124441212
', 'corpemail@somecorp.com', '2011-09-09 00:00:00' ),
( 1, 3,173, '68974', 'Tommy', '', 'James', '123 Any Street', 'New York', 'NY', '10011', '', '2124155000', 'email@somemail.com', '2011-07-09 23:23:23' ),
( 31, 4, 89, '789657', 'Paula', 'P', 'Lady', '45 Main Street', 'Milwaukee', 'WI', '53111', '4148889797', '4148256000', 'email@somemail.com', '2011-08-07 23:00:00' ),
( 22, 4,58, '9897865T', 'Paula', '', 'Lady', '45 Main Street', 'Milwaukee', 'WI', '53112', '4148011234', '', '', '2011-09-03 23:00:00' )
GO


SELECT * FROM [#CustomerInfo]
GO

--DROP TABLE [#CustomerInfo]

-- Desired Results
/*
MasterCustID FirstName MiddleName LastName AddressLine1 City State ZipCode CellPhone BusinessPhone EmailAddress CreateDate
1 Tommy A James 123 Any Street New York NY 10011 2121111212 2124441212 corpemail@somecorp.com 9/9/2011
2 Paula P Lady 45 Main Street Milwaukee WI 53112 4148011234 4148256000 email@somemail.com 9/3/2011

*/

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-15 : 12:46:52
[code]
SELECT ci.*
FROM #CustomerInfo ci
INNER JOIN (SELECT CustFuzzyGroup_Key_Out_ID,MAX(CreateDate) AS Latest
FROM #CustomerInfo
GROUP BY CustFuzzyGroup_Key_Out_ID) ci1
ON ci1.CustFuzzyGroup_Key_Out_ID = ci.CustFuzzyGroup_Key_Out_ID
AND ci1.Latest = ci.CreateDate
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

cdex3
Starting Member

3 Posts

Posted - 2011-09-15 : 15:21:22
Thanks for the reply Visakhm. I mistakenly entered empty strings as opposed to nulls in all of the non null fields. What I really want returned is the row with the max createdate and all non null values. eg for "Paula Lady." I actually want returned "Paula P Lady 45 Main Street Milwaukee WI 53112 4148011234 4148256000 email@somemail.com 9/3/2011.

Notice that the middle initial P, businessphone 4148256000 and email are from the earlier record. I need to return the max record and then any non null values that may exist in the previous rows if the row with the max date has nulls for those columns.

I hope this is clear.

Thanks again for the response.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-16 : 11:58:42
if there are multiple values existing in previous rows for a null valued field, which one should you return? what would be rule in that case?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

cdex3
Starting Member

3 Posts

Posted - 2011-09-16 : 15:34:09
In the case of multiple values I would take the one with the MAX CreateDate.
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-09-17 : 02:31:46
[code]SELECT ci.CustRowID
,ci.CustFuzzyGroup_Key_Out_ID
,ci.CustLocID
,ci.CustNo
,ci.FirstName
,ci.MiddleName
,ci.LastName
,COALESCE(ci.AddressLine1,ci2.AddressLine1) AS AddressLine1
,COALESCE(ci.City,ci3.City) AS City
,COALESCE(ci.State,ci4.State) AS State
,COALESCE(ci.ZipCode,ci5.ZipCode) AS ZipCode
,COALESCE(ci.CellPhone,ci6.CellPhone) AS CellPhone
....
FROM #CustomerInfo ci
INNER JOIN (SELECT CustFuzzyGroup_Key_Out_ID,MAX(CreateDate) AS Latest
FROM #CustomerInfo
GROUP BY CustFuzzyGroup_Key_Out_ID) ci1
ON ci1.CustFuzzyGroup_Key_Out_ID = ci.CustFuzzyGroup_Key_Out_ID
AND ci1.Latest = ci.CreateDate
OUTER APPLY(SELECT TOP 1 AddressLine1
FROM #CustomerInfo
WHERE CustFuzzyGroup_Key_Out_ID=ci.CustFuzzyGroup_Key_Out_ID
AND AddressLine1 IS NOT NULL
ORDER BY CreateDate DESC)ci2
OUTER APPLY(SELECT TOP 1 City
FROM #CustomerInfo
WHERE CustFuzzyGroup_Key_Out_ID=ci.CustFuzzyGroup_Key_Out_ID
AND City IS NOT NULL
ORDER BY CreateDate DESC)ci3
OUTER APPLY(SELECT TOP 1 State
FROM #CustomerInfo
WHERE CustFuzzyGroup_Key_Out_ID=ci.CustFuzzyGroup_Key_Out_ID
AND State IS NOT NULL
ORDER BY CreateDate DESC)ci4
OUTER APPLY(SELECT TOP 1 ZipCode
FROM #CustomerInfo
WHERE CustFuzzyGroup_Key_Out_ID=ci.CustFuzzyGroup_Key_Out_ID
AND ZipCode IS NOT NULL
ORDER BY CreateDate DESC)ci5
OUTER APPLY(SELECT TOP 1 CellPhone
FROM #CustomerInfo
WHERE CustFuzzyGroup_Key_Out_ID=ci.CustFuzzyGroup_Key_Out_ID
AND CellPhone IS NOT NULL
ORDER BY CreateDate DESC)ci6
....
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page
   

- Advertisement -