Processing order of JOINs vs WHERE

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Processing order of JOINs vs WHERE

Author

Topic

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 11:06:51

I've been looking online for this question, but can only find reference to explicit vs implicit joins, which is not what I need to know. Anyway...

Suppose I have an index on a table:


CREATE UNIQUE CLUSTERED INDEX [UCX_tbl_myTable_someDate_userID] ON [tbl_myTable] 
(
	[someDate] ASC,
        [userID] ASC
)

Then run a query such as this:


DECLARE @tmp TABLE (userID INT]
INSERT INTO @tmp (userID) VALUES (1)
INSERT INTO @tmp (userID) VALUES (5)
INSERT INTO @tmp (userID) VALUES (50)


SELECT 
  *
FROM
  [tbl_myTable] t INNER JOIN
  @tmp t2 ON t.userID = t2.userID
WHERE
  (t.someDate BETWEEN @date1 AND @date2)

1. Does SQL automatically know to use the clustered index for the query, looking up someDate first and then userID?
2. What would happen if there was no clustered index as shown, which comes first - JOIN or WHERE?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-18 : 11:12:09

see

http://www.bennadel.com/blog/70-SQL-Query-Order-of-Operations.htm

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 11:48:38

So if FROM comes before WHERE, am I right in thinking that the index column order needs to be reversed --> {userID, someDate}

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2012-07-18 : 12:12:38

You should try the index both ways and see what it does. More than likely, SQL will "colapse" the predicate in the where clause into the join predicate and it will will use either index. But, depending on the selctivity of each colum, UserID-Date might be better than Date-UserID, but hard for me to say without knowing your data.

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 12:12:41

Or, is it possible to put the date limiter within the FROM part of the statement?

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2012-07-18 : 12:17:41

the optimiser will take the query as a whole. Doesn't matter if the joins are in the where clause or not.
There are limitations to what the optimiser will find though (although it is very good) so you should check the query plan if it is causing problems.

For your query it will probably use the clustered index for everything as the table is so small it will all be in memory so using anything else would just mean extra reads.

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 12:26:39

Thanks Lamprey

The table holds login dates for users. The login date column always increases, so (based on another post from this board), I created a unique clustered index on the date column AND the userID column (as shown in the OP). This worked really well and gave a massive performance boost.

Now though, when I run a particular statement the query slows down massively. To be fair it was slow before the index change.
For example when I put all 6000 users into @tbl, the query takes 19 seconds, but using 2000 users takes only 1 second.

The execution plan only shows Index Seeks for each query, although a Table Scan on @tmp shrinks from 8% for 6000 users to 1% for 2000 users. Not sure how that adds up really!

I've reversed the index column order, and it made no difference, so I'm a bit confused...

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2012-07-18 : 12:45:57

quote:
Originally posted by R

Or, is it possible to put the date limiter within the FROM part of the statement?

Yes, you can.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2012-07-18 : 12:52:13

Ahh, having that many rows in a table variable might explain some things. Table variables do not maintain statistics, so (if memory serves me) SQL Server will treat them at tables with one row for statistical purposes. You could try adding a PK to your table variable but, using temp table (also with a PK) will probably make more of a difference.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources