| Author |
Topic |
|
waveform
Yak Posting Veteran
93 Posts |
Posted - 2011-11-29 : 21:48:10
|
I realise that indexing on a bit field alone is not the right thing to do - but what about indexing a bit field + another field?For example, in a web activity log, one would index the timestamp, but another useful index might be on a bit field "IsHTTPS" + timestamp, to quickly view all https actions. Would that be ok, or is it also inefficient?ie.[LogTime] [datetime] NOT NULL[IsHTTPS] [bit] NOT NULLINDEX [IX_ActivityLog_HTTPS] ON (IsHTTPS,LogTime) |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
waveform
Yak Posting Veteran
93 Posts |
Posted - 2011-11-30 : 00:37:35
|
| Thanks Tara. Had to look up "covering index" to see what you were asking. :) No, it's not covering, will just be used as a normal index. Query will be something like:SELECT LogItemId, LogTime, IsHTTPS, EventDescription FROM ActivityLog WHERE (IsHTTPS = 1) AND (LogTime >= @GetFromTime)Is that a sensible index in that context?Also: Does SQL Server allow non-key fields in indexes? |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-11-30 : 20:14:04
|
quote: Originally posted by tkizer You may also want to try LogTime, IsHTTPS as the ordering. Try both. Compare the execution plans. It probably doesn't matter for a covering index though, but not sure.
My gut feeling is that if there are fewer rows where (LogTime >= @GetFromTime)and more where:(IsHTTPS = 1)then LogTime as the first index key field would be better.If this index will only be used where (IsHTTPS = 1) (i.e. you do not ALSO have queries on (IsHTTPS = 0) ) then this might do instead (and the index disk size will be smaller)CREATE INDEX IX_ActivityLog_HTTPSON dbo.MyTableON (LogTime) INCLUDE (LogItemId, EventDescription)WHERE (IsHTTPS = 1) I think you will have to remove [IsHTTPS] from your SELECT list for this index to be used (they are all IsHTTPS=1, so you don't need that column in the SELECT, right?! |
 |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2011-11-30 : 20:26:26
|
| Kristen, selectivity is irrelevant for a covering index so I highly suspect that the ordering of LogTime and IsHTTPs doesn't matter in a covering index. I can't test this right now though as I don't have SQL Server on this computer. I had this same discussion with Kimberly Tripp last year, mentioned in the blog article I linked to.Now a filtered index would be very interesting.Tara KizerMicrosoft MVP for Windows Server System - SQL Serverhttp://weblogs.sqlteam.com/tarad/Subscribe to my blog |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-12-01 : 10:02:00
|
quote: Originally posted by tkizer Kristen, selectivity is irrelevant for a covering index so I highly suspect that the ordering of LogTime and IsHTTPs doesn't matter in a covering index.
Indeed, but the index will have to process a "range" - either all values where "IsHTTPS=1", or all where "LogTime >= @GetFromTime" (depending on which field is first in the index), so I was thinking that that will influence the number of Page Reads required, and thus putting the one with "fewer hits" first will improve performance (rather than selectivity)"I don't have SQL Server on this computer"I never thought I'd see the day where that was the case !! |
 |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-12-01 : 14:39:58
|
| For that query you want the index with IsHTTPS and then LogTime. Reason is that the predicate on LogTime is an inequality.If the index has IsHTTPs first and LogTime second, SQL can do a seek on both columns to find the start of the range and then simple read until it runs out of rows. If LogTime is first and IsHTTPs second, then SQL can only seek to the start of the date range and will have to do a secondary filter to eliminate rows that have IsHTTPs second. It'll be less efficient than the other way around. How much depends on how many rows qualify for IsHTTPs = 1http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/This is true regardless of whether the index is covering or not.--Gail ShawSQL Server MVP |
 |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-12-01 : 14:41:17
|
| p.s. Before including the EventDescription column, what data type is it?--Gail ShawSQL Server MVP |
 |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-12-01 : 14:44:25
|
quote: Originally posted by Kristen Indeed, but the index will have to process a "range" - either all values where "IsHTTPS=1", or all where "LogTime >= @GetFromTime"
No, it won't. SQL is quite capable of seeking on multiple columns if they're in the right order in the index, so if the index has IsHTTPs first and LogTime second then SQL will seek to the first row that qualifies for both predicates and will only have to read the range of rows qualifying for both until it runs out of rows (since the inequality is from a point to the end of the table)--Gail ShawSQL Server MVP |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-12-01 : 15:15:04
|
| "SQL can do a seek on both columns to find the start of the range and then simple read until it runs out of row"Doh! Even my pea-sized brain ought to have been able to work that out! Thanks Gail. |
 |
|
|
waveform
Yak Posting Veteran
93 Posts |
Posted - 2011-12-01 : 15:36:07
|
| Thanks for all the discussion, the issues raised are really helping show how one goes about making a decision on index structure. In this case, there will be quite a large number of log entries, so with a table of say 50k rows, you'd get 100k index entries - the same times appear for each of IsHTTPS = 0 and 1.Being a "log", the data is being represented to the user in time order, so it "seems" that should be the last key. However, if you're only displaying data 1 "page" at a time - you're seeking a starting date, then really only fetching say 50 or 100 records to fill a list, so if your query was:SELECT TOP 100 FROM ActivityLog WHERE LogTime > @StartTime AND IsHTTPS = 1 [edit:] ORDER BY LogTimeThen it may not matter if IsHTTPS even exists the index, right? The db is quickly finding the starting point then doing a row scan of max. 200 records (assuming worst case IsHTTPS 1 and 0 is evenly distributed). That operation wouldn't cost much (I assume), and the index is half the size without IsHTTPS.For reporting purposes, however, you might want to aggregate # entries per week, split into HTTP and HTTPS.Eg:Week 1: HTTP = 2345 hits, HTTPS = 437 hitsetc.Even here, LogTime seems best as the first key, and it would probably benefit from IsHTTPS as the second key to aid aggregation.Does my reasoning there make sense? Funny how it seems IsHTTPS would be the logical first key, but then looking how the data is used suggests otherwise. |
 |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-12-01 : 16:34:33
|
| Please go and read my blog post (link above). Your select doesn't have an order by, so you're getting any old 100 rows, I suggest you put an appropriate Order By on there.If you don't have IsHTTPS anywhere in the index (key or include) then you're forcing key lookups for each of those rows so that SQL can do the secondary filter. It may even ignore the non-covering index and scan the table.Nothing you've said above makes me think that the index should have LogDate as the leading column. If I was doing indexing recommendations, based on everything you've said I'd recommend an index with IsHTTPS, LogDate as the key columns. Include columns depending on the data type and how many rows are fetched.--Gail ShawSQL Server MVP |
 |
|
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2011-12-02 : 03:29:44
|
quote: Originally posted by waveform For reporting purposes, however, you might want to aggregate # entries per week, split into HTTP and HTTPS.Eg:Week 1: HTTP = 2345 hits, HTTPS = 437 hitsetc.Even here, LogTime seems best as the first key, and it would probably benefit from IsHTTPS as the second key to aid aggregation.
To aid aggregation and filter in time you need IsHTTPS first and LogDate second. http://sqlinthewild.co.za/index.php/2011/11/07/sql-university-advanced-indexing-sorting-and-grouping/--Gail ShawSQL Server MVP |
 |
|
|
waveform
Yak Posting Veteran
93 Posts |
Posted - 2011-12-02 : 06:33:21
|
| Many thanks Gail. Sorry, the ORDER BY was meant to be there, just forgot to type it.Seems I've got a lot to learn until my logic gets logical! :) |
 |
|
|
|