Optimize date search; rows are already in 'order'

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Optimize date search; rows are already in 'order'

Author

Topic

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-06-19 : 11:33:56

I have a table that records dates/times of people who log in to a system. The date field is a DateTime type, and a row is added with the current date/time (and userID) whenever someone logs in.
This table currently has about 100,000 rows, and queries on the table are performed using a stored procedure that accepts two parameters: @DateTo and @DateFrom. Given the following simplistic example...


-- @DateFrom is today's date less 3 months
-- @DateTo is GetUTCDate()
SELECT * FROM myTable
WHERE
 (loginDate BETWEEN @DateFrom AND @DateTo)

...without any optimization on the table, I assume that SQL unnecessarily searches all 100,000 rows??

Because the latest date recorded is always the last row in the table, if I search using the date range above, this seems a waste of resources (excuse my ignorance please!).
Would an index on the date column suffice to stop SQL searching ALL rows? I assume there would still be wasted resources because the whole index would need to be scanned?
Is there another way to intelligently stop it searching beyond the row that contains the earliest @dateFrom value?

Any comments are most welcome!

jimf
Master Smack Fu Yak Hacker

2875 Posts

Posted - 2012-06-19 : 11:41:44

An index on loginDate may help. Does the table have a clustered index? If not, loginDate might be a good choice for a clustered index. However, since you're getting all columns out of the table, the optimizer might just say "It's easier if I just do a table scan".

Jim

Everyday I learn something that somebody else already knew

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-06-19 : 12:13:19

Hi Jim

Yes there's already a clustered index on the ID column. We use this to delete rows (if required - but not often). There are only 3 columns in the table (ID, userID, loginDate) so if you can suggest a better way to optimize the table I'd be grateful. It's almost as if I need to tell SQL how to 'behave'...!!

jimf
Master Smack Fu Yak Hacker

2875 Posts

Posted - 2012-06-19 : 12:19:49

Yes, you absolutley have to tell SQL how to behave, or at least give it enough info to make the right choices. I'd create an index on loginDate and include userID
CREATE INDEX IDX ON yourTable(logindate) INCLUDE(UserID). The only sure-fire way to know what the best way is is to test.

Jim

Everyday I learn something that somebody else already knew

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-06-19 : 12:45:38

I would make ID PRIMARY KEY NONCLUSTERED and LoginDate, UserID the unique clustered index, since LoginDate is everincreasing.

N 56°04'39.26"
E 12°55'05.63"

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-02 : 02:43:01

quote:
Originally posted by SwePeso

I would make ID PRIMARY KEY NONCLUSTERED and LoginDate, UserID the unique clustered index, since LoginDate is everincreasing.

N 56°04'39.26"
E 12°55'05.63"

Thanks for that. Sorry for the late reply; I've been on holiday.

By using UserID AND LoginDate as the unique clustered index, that gets around the issue of two people logging in at EXACTLY the same time right (highly improbable but not impossible I guess)??

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-07-02 : 16:05:08

Yes it does.

N 56°04'39.26"
E 12°55'05.63"

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-04 : 04:32:54

quote:
Originally posted by SwePeso

Yes it does.

N 56°04'39.26"
E 12°55'05.63"

Hi and thanks again. I'm learning something new here...

After studying your replies, please see my further questions(!)...

The report I originally mentioned works slightly differently in certain cases (for a specific customer who should only see their own users):


-- @DateFrom is today's date less 3 months
-- @DateTo is GetUTCDate()
-- @tbl_UserIDs is a table variable containing userID's to find login details for
SELECT 
  l.*,
  u.name,
  u.surname
FROM 
  [tbl_LoginDetails] l INNER JOIN
  [tbl_Users] u ON l.userID = u.userID INNER JOIN
  @tbl_userIDs i ON i.userID = u.userID -- restrict to certain users
WHERE
  (u.customerID = @customerID) AND
  (l.loginDate BETWEEN @DateFrom AND @DateTo)

Please correct me if I'm wrong here, but if the new clustered index uses [LoginDate] and [userID] in that order, then it's not possible to use the index to look-up userID for the table join, correct???

If so, would I then need a third non-clustered index on [userID]?

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-11 : 08:49:40

Hi guys
Sorry to pester you but if you could please answer my last few points I'd be very grateful. Many thanks.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2012-07-11 : 11:23:35

quote:
Originally posted by R

After studying your replies, please see my further questions(!)...

<snip>
Please correct me if I'm wrong here, but if the new clustered index uses [LoginDate] and [userID] in that order, then it's not possible to use the index to look-up userID for the table join, correct???

If so, would I then need a third non-clustered index on [userID]?

SQL proably could not seek on UserID in that case (assuming you are also not using the LogonDate). So, if you your predicate only contained UserID, then most likely you'd need a separate index for that.

also

quote:
Originally posted by R

quote:
Originally posted by SwePeso

I would make ID PRIMARY KEY NONCLUSTERED and LoginDate, UserID the unique clustered index, since LoginDate is everincreasing.

N 56°04'39.26"
E 12°55'05.63"

Thanks for that. Sorry for the late reply; I've been on holiday.

By using UserID AND LoginDate as the unique clustered index, that gets around the issue of two people logging in at EXACTLY the same time right (highly improbable but not impossible I guess)??

Just to be clear, it prevents the SAME Logon from logging in more that once within accuracy of that datatype (roughly .003 seconds for a datetime). I wasn't sure if you meant two people with the same logon or not. :)

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-17 : 03:33:15

Okay I've been experimenting with this. Not sure if I did something incorrectly, but I cleared out the two existing indexes (one clustered and one non-clustered), and the primary key was also lost. Is that expected, the clustered index and the primary key being the same thing?

If so, should I rebuild the primary key as two columns (loginDate, UserID)?

Thanks again

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-07-17 : 07:39:04

If you want to identify the login attempt, you probably have an identitiy column. Make this the primary key.
Then make the unique clustered index over {LoginDate, UserID}.

N 56°04'39.26"
E 12°55'05.63"

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-17 : 09:38:13

Hi

The columns in the table are as follows:


[ID]         int IDENTITY(1,1) PRIMARY KEY
[LoginDate]  datetime NOT NULL
[UserID]     int NOT NULL

At present (before I've applied any recommended changes above), I have the following two indexes:


[IX_tbl_Logins_userID_loginDate]  {userID, loginDate}   Non-unique, Non-clustered
[PK_tbl_GoodLoginDetails]         {ID}                  Clustered

As soon as I remove the Clustered index, the primary key setting is lost on the [ID] column. It seems the two are linked.
So using SSMS with Profiler running, I set the PK to be both the LoginDate & userID columns, which automatically generated a unique clustered index on them. I then manually added a non-clustered index on ID to allow me to delete entries if required (not often).

Profile generated the following code:

ALTER TABLE tbl_Logins ADD CONSTRAINT
	PK_tbl_Logins PRIMARY KEY CLUSTERED 
	(
	loginDate,
	userID
	) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_tbl_Logins_ID] ON [dbo].[tbl_Logins] 
(
	[ID] ASC
) WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Does that resemble what you advised? I'm unsure about the Primary Key changing!
Thanks.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-07-17 : 17:07:07

[code]ALTER TABLE tbl_Logins ADD CONSTRAINT
PK_tbl_Logins PRIMARY KEY NONCLUSTERED
(
ID
)
GO

CREATE UNIQUE CLUSTERED INDEX [UCX_tbl_Logins_loginDate_userID] ON [dbo].[tbl_Logins]
(
[loginDate] ASC, userID ASC
)
GO[/code]

N 56°04'39.26"
E 12°55'05.63"

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 02:40:59

Great, thanks very much SwePeso.

Just one last question... In your final example you've specified the index columns as

[loginDate] ASC, userID ASC

Is a direction (ASC/DESC) always required when specifying an index? I'm just thinking that if two users logged in at exactly the same time, but the second userID was lower than the first, would this cause a problem?

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-07-18 : 08:11:24

ASC is the default sort order. I just added it by habit.
No.

N 56°04'39.26"
E 12°55'05.63"

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 09:01:52

Super - thanks very much for your help and guidance

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-07-18 : 11:19:25

Just to give you a little feedback, using the new index structure has provided me with an 800% performance gain! You guys are awesome - thanks again!

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-07-18 : 13:36:27

Thank you for your feedback. Much appreciated.

N 56°04'39.26"
E 12°55'05.63"

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources