Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Index on bit field + another field?

Author  Topic 

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-11-29 : 21:48:10
I realise that indexing on a bit field alone is not the right thing to do - but what about indexing a bit field + another field?

For example, in a web activity log, one would index the timestamp, but another useful index might be on a bit field "IsHTTPS" + timestamp, to quickly view all https actions. Would that be ok, or is it also inefficient?

ie.
[LogTime] [datetime] NOT NULL
[IsHTTPS] [bit] NOT NULL

INDEX [IX_ActivityLog_HTTPS] ON (IsHTTPS,LogTime)

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-11-29 : 22:41:09
If the index is covering, then a non-selective index is fine. This is directly from Kimberly Tripp. I even blogged about it: http://weblogs.sqlteam.com/tarad/archive/2010/12/15/missing-indexes-dmv-report-3-billion-impact.aspx

So it really depends on your query. If the index will "cover" your query, then yes, add it.

Do these two columns cover your entire query? I don't just mean the where clause, I mean the entire thing for that table. Post the query if you need help determining this.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-11-30 : 00:37:35
Thanks Tara. Had to look up "covering index" to see what you were asking. :) No, it's not covering, will just be used as a normal index. Query will be something like:

SELECT LogItemId, LogTime, IsHTTPS, EventDescription FROM ActivityLog WHERE (IsHTTPS = 1) AND (LogTime >= @GetFromTime)

Is that a sensible index in that context?

Also: Does SQL Server allow non-key fields in indexes?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-11-30 : 20:02:00
Yes SQL allows non-key fields in indexes.

Use this as your index: (IsHTTPS, LogTime) INCLUDE LogItemId, EventDescription. That will cover your query. It's important that you add the include columns too.

Here's an example:
CREATE INDEX idx_SomeIndex ON Table1 (Column1, Column2) INCLUDE (Column3, Column4)

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-11-30 : 20:02:51
You may also want to try LogTime, IsHTTPS as the ordering. Try both. Compare the execution plans. It probably doesn't matter for a covering index though, but not sure. Hopefully Gail is reading this thread and will chime in.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-11-30 : 20:14:04
quote:
Originally posted by tkizer

You may also want to try LogTime, IsHTTPS as the ordering. Try both. Compare the execution plans. It probably doesn't matter for a covering index though, but not sure.


My gut feeling is that if there are fewer rows where
(LogTime >= @GetFromTime)
and more where:
(IsHTTPS = 1)

then LogTime as the first index key field would be better.

If this index will only be used where (IsHTTPS = 1) (i.e. you do not ALSO have queries on (IsHTTPS = 0) ) then this might do instead (and the index disk size will be smaller)

CREATE INDEX IX_ActivityLog_HTTPSON dbo.MyTable
ON (LogTime)
INCLUDE (LogItemId, EventDescription)
WHERE (IsHTTPS = 1)

I think you will have to remove [IsHTTPS] from your SELECT list for this index to be used (they are all IsHTTPS=1, so you don't need that column in the SELECT, right?!
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2011-11-30 : 20:26:26
Kristen, selectivity is irrelevant for a covering index so I highly suspect that the ordering of LogTime and IsHTTPs doesn't matter in a covering index. I can't test this right now though as I don't have SQL Server on this computer. I had this same discussion with Kimberly Tripp last year, mentioned in the blog article I linked to.

Now a filtered index would be very interesting.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-12-01 : 10:02:00
quote:
Originally posted by tkizer

Kristen, selectivity is irrelevant for a covering index so I highly suspect that the ordering of LogTime and IsHTTPs doesn't matter in a covering index.


Indeed, but the index will have to process a "range" - either all values where "IsHTTPS=1", or all where "LogTime >= @GetFromTime" (depending on which field is first in the index), so I was thinking that that will influence the number of Page Reads required, and thus putting the one with "fewer hits" first will improve performance (rather than selectivity)

"I don't have SQL Server on this computer"

I never thought I'd see the day where that was the case !!
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-01 : 14:39:58
For that query you want the index with IsHTTPS and then LogTime. Reason is that the predicate on LogTime is an inequality.

If the index has IsHTTPs first and LogTime second, SQL can do a seek on both columns to find the start of the range and then simple read until it runs out of rows. If LogTime is first and IsHTTPs second, then SQL can only seek to the start of the date range and will have to do a secondary filter to eliminate rows that have IsHTTPs second. It'll be less efficient than the other way around. How much depends on how many rows qualify for IsHTTPs = 1

http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

This is true regardless of whether the index is covering or not.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-01 : 14:41:17
p.s. Before including the EventDescription column, what data type is it?

--
Gail Shaw
SQL Server MVP
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-01 : 14:44:25
quote:
Originally posted by Kristen

Indeed, but the index will have to process a "range" - either all values where "IsHTTPS=1", or all where "LogTime >= @GetFromTime"


No, it won't. SQL is quite capable of seeking on multiple columns if they're in the right order in the index, so if the index has IsHTTPs first and LogTime second then SQL will seek to the first row that qualifies for both predicates and will only have to read the range of rows qualifying for both until it runs out of rows (since the inequality is from a point to the end of the table)

--
Gail Shaw
SQL Server MVP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-12-01 : 15:15:04
"SQL can do a seek on both columns to find the start of the range and then simple read until it runs out of row"

Doh! Even my pea-sized brain ought to have been able to work that out! Thanks Gail.
Go to Top of Page

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-12-01 : 15:36:07
Thanks for all the discussion, the issues raised are really helping show how one goes about making a decision on index structure. In this case, there will be quite a large number of log entries, so with a table of say 50k rows, you'd get 100k index entries - the same times appear for each of IsHTTPS = 0 and 1.

Being a "log", the data is being represented to the user in time order, so it "seems" that should be the last key. However, if you're only displaying data 1 "page" at a time - you're seeking a starting date, then really only fetching say 50 or 100 records to fill a list, so if your query was:

SELECT TOP 100 FROM ActivityLog WHERE LogTime > @StartTime AND IsHTTPS = 1 [edit:] ORDER BY LogTime

Then it may not matter if IsHTTPS even exists the index, right? The db is quickly finding the starting point then doing a row scan of max. 200 records (assuming worst case IsHTTPS 1 and 0 is evenly distributed). That operation wouldn't cost much (I assume), and the index is half the size without IsHTTPS.

For reporting purposes, however, you might want to aggregate # entries per week, split into HTTP and HTTPS.
Eg:
Week 1: HTTP = 2345 hits, HTTPS = 437 hits
etc.

Even here, LogTime seems best as the first key, and it would probably benefit from IsHTTPS as the second key to aid aggregation.

Does my reasoning there make sense? Funny how it seems IsHTTPS would be the logical first key, but then looking how the data is used suggests otherwise.
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-01 : 16:34:33
Please go and read my blog post (link above).

Your select doesn't have an order by, so you're getting any old 100 rows, I suggest you put an appropriate Order By on there.

If you don't have IsHTTPS anywhere in the index (key or include) then you're forcing key lookups for each of those rows so that SQL can do the secondary filter. It may even ignore the non-covering index and scan the table.

Nothing you've said above makes me think that the index should have LogDate as the leading column. If I was doing indexing recommendations, based on everything you've said I'd recommend an index with IsHTTPS, LogDate as the key columns. Include columns depending on the data type and how many rows are fetched.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-02 : 03:29:44
quote:
Originally posted by waveform

For reporting purposes, however, you might want to aggregate # entries per week, split into HTTP and HTTPS.
Eg:
Week 1: HTTP = 2345 hits, HTTPS = 437 hits
etc.

Even here, LogTime seems best as the first key, and it would probably benefit from IsHTTPS as the second key to aid aggregation.


To aid aggregation and filter in time you need IsHTTPS first and LogDate second.
http://sqlinthewild.co.za/index.php/2011/11/07/sql-university-advanced-indexing-sorting-and-grouping/

--
Gail Shaw
SQL Server MVP
Go to Top of Page

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-12-02 : 06:33:21
Many thanks Gail. Sorry, the ORDER BY was meant to be there, just forgot to type it.

Seems I've got a lot to learn until my logic gets logical! :)
Go to Top of Page
   

- Advertisement -