
 Index Performance Tuning Help


BaggaDonuts
Yak Posting Veteran

52 Posts

Posted - 2007-10-22 : 18:47:30
I'm using the Database Engine Tuning Advisor to do some performance evaluation on my database. I have one particular table that will potentially have a couple million rows, but this may not happen for a few months to a year from now.

When I run the advisor with 1.5 million rows, it recommends that I add an index across all columns in this table (a covering index) for an estimated 91% improvement. When I run it with 1,500 rows in it, it recommends an index on 2 key columns for an 8% improvement.

How should I reconcile this? Can I have both, and what would that do to performance?

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-10-22 : 19:05:41
You shouldn't even worry about a table with 1,500 rows in it. The optimizer will often find it faster to do a table scan/clustered index scan.

An index that covers all columns in a table sounds like a bad idea.

To figure out what you should do, load up a test database with the number of rows that you expect in one year. Then play with the indexes and queries to see what works best.

Can you post the more frequent SELECT statements that will run against this table, so that we can make an educated guess at what the indexes should be? You can then compare our suggestions to the advisor's.

As a side note, I almost never use the advisor as it typically does not produce realistic results. We use a performance environment to figure out what to do. In the environment we have tools that can load data and simulate production access.
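To load the test database quickly, one option is to generate synthetic rows from a numbers table. A rough sketch, untested and based on the t_AtomVersion table posted below - the value distributions are assumptions you'd want to match to your real data, and it assumes dbo.t_Atom already holds the referenced AtomIds:

```sql
-- Sketch: inflate a test copy of t_AtomVersion to ~1.5 million rows
-- using master..spt_values as a makeshift numbers table (2500+ rows,
-- so the cross join yields 6M+ candidates).
;WITH n AS (
    SELECT TOP (1500000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM master.dbo.spt_values a
    CROSS JOIN master.dbo.spt_values b
)
INSERT INTO dbo.t_AtomVersion
    (AtomId, StructureVersion, ContentVersion, AtomVersionStatusId, AssemblySize)
SELECT i % 10000 + 1,   -- spread rows over ~10,000 atoms (must exist in t_Atom)
       i / 10000 + 1,   -- several structure versions per atom
       i,               -- unique, so UC_t_AtomVersion is satisfied
       1,               -- assumed 'release' status id
       0
FROM n
```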

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

BaggaDonuts
Yak Posting Veteran

52 Posts

Posted - 2007-10-22 : 19:20:10
Thanks Tara. It was much faster after adding the recommended index with 1.5 million rows in the table. It's not quite all columns, as you'll see below. Here's some more info (forgive the length):

Table Structure of the table that will contain the large number of rows:

CREATE TABLE [dbo].[t_AtomVersion](
    [AtomVersionId] [bigint] IDENTITY(1,1) NOT NULL,
    [AtomId] [int] NOT NULL,
    [StructureVersion] [int] NOT NULL,
    [ContentVersion] [int] NOT NULL,
    [CreatedDate] [datetime] NOT NULL CONSTRAINT [DF_t_AtomVersion_CreatedDate] DEFAULT (getdate()),
    [AtomVersionStatusId] [smallint] NOT NULL,
    [VersionDescription] [varchar](256) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [AssemblySize] [bigint] NOT NULL CONSTRAINT [DF_t_AtomVersion_FileSize] DEFAULT ((0)),
    CONSTRAINT [PK_t_AtomVersion] PRIMARY KEY CLUSTERED
    (
        [AtomVersionId] ASC
    ) WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY],
    CONSTRAINT [UC_t_AtomVersion] UNIQUE NONCLUSTERED
    (
        [AtomId] ASC,
        [StructureVersion] ASC,
        [ContentVersion] ASC
    ) WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
USE [Packaging]
GO
ALTER TABLE [dbo].[t_AtomVersion] WITH CHECK ADD CONSTRAINT [FK_t_AtomVersion_t_Atom] FOREIGN KEY([AtomId])
REFERENCES [dbo].[t_Atom] ([AtomId])
GO
ALTER TABLE [dbo].[t_AtomVersion] WITH CHECK ADD CONSTRAINT [FK_t_AtomVersion_t_AtomVersionStatus] FOREIGN KEY([AtomVersionStatusId])
REFERENCES [dbo].[t_AtomVersionStatus] ([AtomVersionStatusId])


recommended index by the tuning advisor:

CREATE NONCLUSTERED INDEX [IX_t_AtomVersion_CoveringIndex] ON [dbo].[t_AtomVersion]
(
[AtomId] ASC,
[AtomVersionId] ASC,
[StructureVersion] ASC,
[AtomVersionStatusId] ASC,
[AssemblySize] ASC
)
INCLUDE ( [ContentVersion]) WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
go


Finally, here's a snippet of the stored procedure I'll be running QUITE FREQUENTLY. There may be much better ways of tweaking performance on this query, but that's a separate issue, I guess.


select A.AtomId, AT.AtomTypeKey, SQ.StructureVersion, max(CV.ContentVersion) as ContentVersion, SQ.AssemblySize
from dbo.t_AtomVersion CV with(nolock)
inner join (
    select AP.AtomId, SV.StructureVersion, SV.AssemblySize
    from @atomPriority AP
    inner join (
        select AtomId, AssemblySize, max(StructureVersion) as StructureVersion
        from dbo.t_AtomVersion AV with(nolock)
        inner join dbo.t_AtomVersionStatus SVS with(nolock) on SVS.AtomVersionStatusId = AV.AtomVersionStatusId
        where SVS.Status = 'release'
        group by AtomId, AssemblySize
    ) SV on AP.AtomId = SV.AtomId
) SQ on SQ.AtomId = CV.AtomId and SQ.StructureVersion = CV.StructureVersion
inner join dbo.t_AtomVersionStatus CVS with(nolock) on CVS.AtomVersionStatusId = CV.AtomVersionStatusId
inner join dbo.t_Atom A with(nolock) on A.AtomId = SQ.AtomId
inner join dbo.t_AtomType AT with(nolock) on AT.AtomTypeId = A.AtomTypeId
where CVS.Status = 'release'
group by A.AtomId, SQ.StructureVersion, AT.AtomTypeKey, SQ.AssemblySize



tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-10-22 : 19:23:53
Wow. I just don't have the time to work on such a large query without getting paid for it. I wasn't expecting it to be this large or complex.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

BaggaDonuts
Yak Posting Veteran

52 Posts

Posted - 2007-10-22 : 19:30:01
Haha, yeah, I figured that would be the response before I posted it :) Probably one of the more complex problems I've worked on. Thanks anyway.

BaggaDonuts
Yak Posting Veteran

52 Posts

Posted - 2007-10-22 : 19:36:24
To close: after loading the test database with 1.5 million records and adding the suggested index, the query went from taking 4 seconds to run to less than a second. I'll probably just go with the recommended index for the biggest improvement.

Kristen
Test

22859 Posts

Posted - 2007-10-23 : 02:47:04
What's with the nested sub-select (I haven't read it carefully)? Is that just to match the MAX value in each group? Because if so, there are better ways of doing that in SQL 2005.

And all those NOLOCKs are a REALLY bad idea (or are due to really bad design) on a production system.

Kristen

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-23 : 03:05:26
Why is NOLOCK bad on a production system?
I think it depends on the usage of the system, right?

Anyway, try this kind of SQL Server 2005 query:

SELECT AtomId,
       AssemblySize,
       StructureVersion,
       ContentVersion
FROM (
    SELECT av.AtomId,
           av.AssemblySize,
           av.StructureVersion,
           av.ContentVersion,
           ROW_NUMBER() OVER (PARTITION BY av.AtomID, av.AssemblySize ORDER BY av.StructureVersion DESC) AS StructureVersionRecID,
           ROW_NUMBER() OVER (PARTITION BY av.AtomID, at.AtomTypeKey ORDER BY av.ContentVersion DESC) AS ContentVersionRecID
    FROM dbo.t_AtomVersion AS av WITH (NOLOCK)
    INNER JOIN @AtomPriority AS ap ON ap.AtomID = av.AtomID
    INNER JOIN dbo.t_AtomVersionStatus AS svs WITH (NOLOCK) ON svs.AtomVersionStatusId = av.AtomVersionStatusId
    INNER JOIN dbo.t_Atom AS a WITH (NOLOCK) ON a.AtomId = av.AtomId
    INNER JOIN dbo.t_AtomType AS at WITH (NOLOCK) ON at.AtomTypeId = a.AtomTypeId
    WHERE svs.Status = 'Release'
      AND ap.StructureVersion = av.StructureVersion
) AS d
WHERE StructureVersionRecID = 1
  AND ContentVersionRecID = 1



E 12°55'05.25"
N 56°04'39.16"

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-23 : 03:19:23
I am just raising the question because I am currently working with a database that has about 4,800 simultaneous users.
All upserts are made between 1 AM and 3 AM, with TABLOCK and no users logged on.
Most tables have about 4-5 million records, and there are about 200 tables.

Adding NOLOCK really made a huge difference for us.



E 12°55'05.25"
N 56°04'39.16"

Kristen
Test

22859 Posts

Posted - 2007-10-23 : 04:25:45
"Why is NOLOCK bad on a production system?
I think it depends on the usage of system, right?
"

Oh, for sure you can design it in, but that has never been the case in the systems I have come across, and I expect it is as rare as hen's teeth in the questions that are asked here.

The problem is with reading dirty data: information is displayed to the user which may not be correct, and which they then act on. Maybe they click a link and get an error. They report the error, it's unreproducible, and no one can explain what happened.

Or they sell $1,000,000 of stock - that they didn't actually hold.

Or the index does a page split and the query fails with an error. Again, almost impossible to reproduce (and that's assuming the devs actually know what's going on).

"All upserts are made during 1 am to 3 am with tablock and no users logged on."

Sounds like the database should be READ ONLY the rest of the time, or use "set transaction isolation level read uncommitted".
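To illustrate: setting the isolation level once per session behaves like putting NOLOCK on every table reference, and it's easier to back out later. A sketch (the column names follow the table posted earlier; everything else is illustrative):

```sql
-- Instead of a WITH (NOLOCK) hint on every table reference,
-- set the isolation level once at the top of the proc or connection:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT av.AtomId, av.StructureVersion
FROM dbo.t_AtomVersion av          -- no per-table hints needed now
INNER JOIN dbo.t_AtomVersionStatus s ON s.AtomVersionStatusId = av.AtomVersionStatusId
WHERE s.Status = 'release'

-- Return to the default when done:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
```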

I did a search for you; I don't see anyone debunking the "never use" rule, only "relaxing the rule in very specific circumstances":

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=24119
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=77064
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=23386
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=35132
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=70297
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=84906#312986

Kristen

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2007-10-23 : 05:25:28
Read-only would have been cool, but statistics are not updated when in read-only mode.



E 12°55'05.25"
N 56°04'39.16"

Kristen
Test

22859 Posts

Posted - 2007-10-23 : 05:29:05
If there are NO updates (by users during their use of the app), then I reckon you need

set transaction isolation level read uncommitted

or similar.

Or put that data in a separate database (i.e. all "read only" data) and access it "uncommitted".

Kristen

pootle_flump

1064 Posts

Posted - 2007-10-25 : 11:11:17
The covering index looks right given the query. However, you should also test the impact it has on inserts, updates and deletes - this covering index covers most of the table's columns, so it should have a noticeable impact on those operations.

Kristen
Test

22859 Posts

Posted - 2007-10-26 : 05:53:32
"this covering index covers most of the table columns and so should have a bit of an impact on these operations"

But it's only going to be considered based on the first key column, and thus only when that is suitable and sufficiently "selective", isn't it?

Kristen

pootle_flump

1064 Posts

Posted - 2007-10-26 : 06:28:49
The operations I meant were updates, deletes and inserts. Since the index is not much smaller than the table, he can expect a pretty big hit on data changes. It looks OK to me as far as the query itself goes. For covering indexes, the selectivity of the first column is not (always) such a big issue. For this query there were a couple of candidate columns for the leftmost position (AtomVersionStatusId, AtomId). AtomId is probably the most sensible, but I could imagine the other working OK too, depending on the data.

Kristen
Test

22859 Posts

Posted - 2007-10-26 : 06:33:05
"For covering indexes the selectivity of the first column is not (always) such a big issue"

I was meaning to clarify that the index first has to be used before it can cover anything!

There is sometimes a mistaken belief that "putting everything in an index" makes it useful for all scenarios, rather than having carefully considered indexes for various specific jobs, and then overloading those indexes, as appropriate, so that they better cover the queries they assist with.

Kristen

pootle_flump

1064 Posts

Posted - 2007-10-26 : 08:52:08
Ah I see and completely agree.

I think it would use the covering index no matter how selective the first column is, though, so long as it covers the query. The first column would affect just how well it could use the index. Whatever the selectivity, an NCI scan < a CI scan (... almost always).

Kristen
Test

22859 Posts

Posted - 2007-10-26 : 12:46:23
I don't know well enough to be confident, but my thinking is that it first picks the index that solves the query most cheaply, and then, if that index has covering columns, they are used elsewhere in the query if useful.

I'll do a test when I have a moment. A simple test might be to rearrange that index so that something "less attractive" is the first key, and see if it is still used.
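That test could look something like this - rebuild the advisor's index with a less selective leading key and compare the plans. A sketch only; the index and column names are from the post above, the rest is assumption:

```sql
-- Recreate the recommended index with the low-selectivity status column
-- moved to the front, keeping the same covering columns.
CREATE NONCLUSTERED INDEX [IX_t_AtomVersion_CoveringIndex] ON dbo.t_AtomVersion
(
    AtomVersionStatusId,   -- "less attractive" leading key
    AtomId,
    AtomVersionId,
    StructureVersion,
    AssemblySize
)
INCLUDE (ContentVersion)
WITH (DROP_EXISTING = ON)
GO

SET SHOWPLAN_TEXT ON
GO
-- run the SELECT here and check whether the optimizer still
-- chooses IX_t_AtomVersion_CoveringIndex (seek vs. scan)
```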

Kristen
   
