Which index is chosen for a select query

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2000 Forums

SQL Server Development (2000)

Which index is chosen for a select query

Author

Topic

phylo
Starting Member

2 Posts

Posted - 2008-10-26 : 05:04:55

Hi all,
I have a table with a clustered index and several non clustered.
I'm getting some unexpected results with the "Display Estimated Execution Plan".

My Clustered index includes a timestamp (datetime) column and the id (bigint) column.
I have another index including userId (int) and type (smallint).

I wish to select old, irrelevant rows according to the timestamp and type. The Estimated Execution Plan shows a clustered index seek as I expect. (Cost 91%, the other 9% is Parallelism/Gather Streams, whatever that means)
If I attempt to select rows only according to the timestamp (ignoring the type filtering), the Plan shows the non clustered index seek, which seems to have nothing to do with the columns I have or want to select. Naturally, it's estimated row count is significantly higher, but the I/O cost is slightly lower.

Why does this happen? I always figured SQL will prefer the clustered, then non clustered and then full scan when the WHERE criteria has index columns matches. How does it determine which index to use?

Thanks for any input on this matter

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2008-10-26 : 06:08:04

The optimiser will pick the indexes that will result in the least work done. Typically, it will prefer nonclustered indexes if it can seek effectively on them, because they are smaller and require less IOs to read. If, however, the nonclustered indexes aren't covering and the query returns a lot of data, it may pick a clustered index scan (equivalent to a table scan) because that is cheaper than doing all the bookmark lookups. It depends on the query and the data though.

If you can post the table structure, the index definitions and the query, it will be easier to explain why you're seeing what you're seeing.

--
Gail Shaw
SQL Server MVP

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-10-26 : 07:19:28

make use of sp_helpindex 'yourtablename' to see what all indexes exist on table and them post it along with table definition and query used.

phylo
Starting Member

2 Posts

Posted - 2008-10-26 : 08:36:35

Something weird going on here.
I created the following structure from my table (dropping the irrelevant columns) with the Query Analyzer "Script Object To...":

CREATE TABLE [gameData] (
[gameId] [bigint] NOT NULL ,
[userId] [int] NOT NULL ,
[gametype] [smallint] NOT NULL ,
[gameEndTime] [datetime] NULL
) ON [PRIMARY]
GO
create unique clustered index CIX_GameData_Game_End_Time_GameId on gameData ( gameEndTime, gameId )
create nonclustered index IX_gameDate_userid_ on gameData ( userId, gametype )

My select query is:
select gameId, gameEndTime from gameData where gameEndTime < '2008/06/01' and gameType = 1 --(which performs a clustered index scan)
and
select gameId, gameEndTime from gameData where gameEndTime < '2008/06/01' --(which performs a non clustered index scan)

Thing is, with the above code, both SELECTs run on the clustered index.
I tried to duplicate my live table (with the full CREATE TABLE and the index creation) only without the data, and again both SELECTs appear to behave the same.

Under the SQL Query Analyzer, when I run the Database Object Information (ALT+F1), both tables appear the same (except for the fact that one has 22 million rows and the other has 0)

Is there another way to see how my these tables differ? Because as of now I cannot reproduce the problem I'm experiencing...

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2008-10-26 : 12:19:32

quote:
select gameId, gameEndTime from gameData where gameEndTime < '2008/06/01'

That can do a scan of the NC index because all the columns required for the query are within that index. The clustering key is present in all nonclustered indexes because it's the row's 'location'.
Since the NC index is smaller than the cluster, a scan of the NC is far faster than a scan of the cluster would be. I'm guessing that most of the data has a gameEndTime before the beginning of June. If so, the read of the entire NC index could very well require less reads than a seek of the NC index followed by a read backwards through all the leaf pages.
Not sure why the first is doing a scan of the cluster. It should also be doing a nonclustered index scan

You said you dropped the irrelevant columns. How many 'irrelevant columns' are there in the table? Are your indexes fragmented? Are your statistics up to date?

quote:
Under the SQL Query Analyzer, when I run the Database Object Information (ALT+F1), both tables appear the same (except for the fact that one has 22 million rows and the other has 0)

The optimiser is not likely to pick the same plan for a table with 22 million rows as it will for a table with 0 rows. The costs of various operations will be radically different for the two and hence the optimal plan will differ

--
Gail Shaw
SQL Server MVP

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources