removing duplicates in table2 when joining

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

removing duplicates in table2 when joining

Author

Topic

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-07 : 08:27:20

table1:
orderid value result
101 60 passed

table2:
orderid value result
101 60 passed
101 0 none

table1 and table2 has same number of columns. table1 has complete data. table2 has less number of rows
compared to table1. but i want to extract the rows irrespective of data in table2 or not for orderids. so i
did left join. the problem is there are some duplicate orderids in table2. my requirement is i have to join
two tables and if there are duplicates for any particular orderid in table2, then i should take the value when result='passed'.
table1 should not be affected. result='passed' should be applied only for table2 when there are duplicates in tabl2 for a particular orderid

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-07 : 08:32:12

do you mean this?


update t2
set t2.value=t1.value
from table2 t2
inner join table1 t1
on t1.orderid = t2.orderid
and t2.result = 'passed'

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-07 : 09:03:48

consider table1:

orderid value result
101 60 passed
102 40 none
103 80 none

table2:

orderid value result
101 60 passed
101 0 none
102 30 passed
102 0 none

I performed this query:

select t1.orderid, t1.value, t2.value from table1 t1 left outer join table2 t2
on t1.orderid = t2.orderid

the output is like this:
t1.orderid t1.value t2.value
101 60 60
101 60 0
102 40 30
102 40 0
103 80 0

but i want the output like this:
t1.orderid t1.value t2.value
101 60 60
102 40 30
103 80 null

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-07 : 09:05:09

[code]
select t1.orderid, t1.value, t2.value
from table1 t1
left outer join table2 t2
on t1.orderid = t2.orderid
and t2.result='passed'
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-07 : 09:45:20

that is missing out the row
103 80 null
of table1.

I want every row of table1 irrespective of any status. and if there are duplicates in table2 for a particular orderid, then it should consider the one with status= passed for that particular orderid in table2.

if there are no duplicates in table2, then it should give the value with whatever status
consider table1:

orderid value result
101 60 passed
102 40 none
103 80 none
104 40 none
table2:

orderid value result
101 60 passed
101 0 none
102 30 passed
102 0 none
103 55 none

the output should be
101 60 60
102 40 30
103 80 null
104 40 none

from the above, the duplicates are there for orderids 101 and 102 only. table2 does not have orderid 104, but it should come in the output as table1 has it. and for orderid 103, there is no duplicate in table2, so it should give the value of it.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-07 : 10:00:40

its working fine for me. you're probably not using it as suggested
see below


declare @table1 table
(
orderid int,
value int,
result varchar(100)
)
insert @table1
select 101, 60, 'passed' union all
select 102, 40 ,'none'  union all
select 103, 80, 'none' union all
select 104, 40, 'none'

declare @table2 table
(
orderid int,
value int,
result varchar(100)
)
insert @table2
select 101, 60, 'passed' union all
select 101, 0, 'none' union all
select 102, 30, 'passed'union all
select 102, 0, 'none' union all
select 103, 55, 'none'

select t1.orderid, t1.value, t2.value 
from @table1 t1 
left outer join @table2 t2
on t1.orderid = t2.orderid
and t2.result='passed'

output
-----------------------------------------
orderid	value	value
101	60	60
102	40	30
103	80	NULL
104	40	NULL

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-07 : 11:10:25

sorry, it's my fault. I didn't give the output what i wanted correctly.
could you please consider the given tables again,
the output should be
101 60 60
102 40 30
103 80 55
104 40 null

from the above, for the orderid 103 it is taking the value from table2. the condition should apply only when there are duplicates in table2. if there are no duplicates in table2, it should retrieve the value from table2 irrespective of the status for the given orderid.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-07 : 11:17:35

[code]
select t1.orderid, t1.value,
COALESCE(MAX(CASE WHEN result='passed' THEN t2.value END),MAX(CASE WHEN result<>'passed' THEN t2.value END)) AS t2value
from @table1 t1
left outer join @table2 t2
on t1.orderid = t2.orderid
group by t1.orderid, t1.value
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-08 : 08:10:38

thank you for your query. I am trying to perform the same query for the condition datetimecolumn = max(datetimecolumn) instead of result='passed'. But i am confused as it would lead to one max function inside the case and the other outside.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-08 : 08:49:42

you cant use one max function inside other. which max(datetime) you're trying to look for?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Jeffreys
Starting Member

45 Posts

Posted - 2011-12-08 : 09:47:11

If I understand your question correctly, this may work. It seems you want to take the union of two tables. If duplicate orderid's exist you want to take the orderid with a 'passed' result. This should work assuming the result can be either 'passed' or 'failed'. If the failed value for result is a null, change the sort order to asc.

SELECT
orderId
,Value
,Result
FROM (
select
RowN = ROW_NUMBER() OVER (PARTITION BY orderID ORDER BY Result desc)
,orderId
,Value
,Result
FROM (
SELECT
orderId
,Value
,Result
FROM table1 t1

UNION ALL

SELECT
orderId
,Value
,Result
FROM table1 t2
)ivw
)ivw2
WHERE RowN = 1

gagani
Posting Yak Master

112 Posts

Posted - 2011-12-09 : 06:00:41

If there are duplicates in second table, then it should take the value when datetime column is maximum.
Earlier we considered the condition result=passed , user do not want that condition now. he only wants based on the datetime column is maximum.
consider table1:
orderid value datetimecolumn
101 60 23/11/2011
102 40 24/11/2011
103 80 25/11/2011
104 40 25/11/2011
table2:

orderid value result
101 60 23/11/2011
101 0 22/11/2011
102 30 26/11/2011
102 0 25/11/2011
103 55 23/11/2011

the output should be
101 60 60
102 40 30
103 80 55
104 40 null

the duplicates in second table are filtered based on maximum of datetime column.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-12-09 : 06:11:10

ok. see below


SELECT t1.orderid,t1.value,t4.value
FROM table1 t1
LEFT JOIN 
(SELECT t2.*
 FROM table2 t2
 INNER JOIN (SELECT orderid,MAX(result) AS MaxDate
            FROM table2
            GROUP BY orderid)t3
 ON t3.orderid = t2.orderid
 AND t3.MaxDate = t2.result
 )t4
ON t4.orderid = t1.orderid

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources