Best way to use a left join and prevent duplicates

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Best way to use a left join and prevent duplicates

Author

Topic

R J
Starting Member

2 Posts

Posted - 2010-09-12 : 17:10:00

I think I am looking for a result similar to the one mentioned in this thread:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=117459

This solution works great in this example, but my query is a little different and I couldn't figure out how to make that method work for me.

Here is a simple example of the issue. This query might return duplicates, because more than one value might exist with the matching parameters:

select t1.*, t2.*
from tblTable1 t1
left join tblTable2 t2 on t2.Field1 = t1.Field1 and t2.Field2 = t1.Field2

So, to prevent duplicates, I would correlate a subquery:

select t1.*, t2.*
from tblTable1 t1
left join tblTable2 t2 on t2.ID = 
	(select top 1 ID from tblTable2 where Field1 = t1.Field1 and Field2 = t2.Field2)

Depending on the amount of data in these tables, this solution can be very slow. Is there a better way to write this? Any help would be greatly appreciated.

namman
Constraint Violating Yak Guru

285 Posts

Posted - 2010-09-12 : 20:21:00

Could you please, provide sample data and expected output?

quote:

Here is a simple example of the issue. This query might return duplicates, because more than one value might exist with the matching parameters:

select t1.*, t2.*
from tblTable1 t1
left join tblTable2 t2 on t2.Field1 = t1.Field1 and t2.Field2 = t1.Field2

So, to prevent duplicates, I would correlate a subquery:

select t1.*, t2.*
from tblTable1 t1
left join tblTable2 t2 on t2.ID =
(select top 1 ID from tblTable2 where Field1 = t1.Field1 and Field2 = t2.Field2)

With that limited info, I think it is likely your second query may return unexpected result.

quote:

Depending on the amount of data in these tables, this solution can be very slow. Is there a better way to write this?

We may use row_number() as another way to get as the same result as subquery does. Which one is better? It's depended. I know some DBAs try to avoid using subquery because it is bad performance. I'm disagree. For me, subquery is not as bad as it looks because SQL is a set-based solution ... In short, to determine which one is better, we need more info about the table structure, indexes on those tables ...

R J
Starting Member

2 Posts

Posted - 2010-09-13 : 02:46:18

I used a generic example on purpose because I was trying to find out how to write an equivalent SQL query using row_number(), if it's even possible. I don't have a specific need for this solution at this time, it's just a scenario that I encounter from time to time. In the past I have always found a way to make things work sufficiently, I just want to learn about other options, if they exist, so I know all of the options that I should be considering.

Is there a way to write this kind of statement using row_number()?

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources