To index or not? - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

SQL Server Administration (2005)

To index or not?

Author

Topic

nic
Posting Yak Master

209 Posts

Posted - 2007-11-26 : 16:52:24

Hi,
We have a number of "lookup" tables that consist of 8 or 9 columns with the following general structure:

Identity Primary Key
Param1 varchar
Param2 varchar
Param3 float
Param4 float
Param5 varchar
Param6 date
Value1 float

These are "Read Only" tables that are never Inserted/Updated in production. The whole purpose is to return the "Value1" result. All "Param" columns are used in the "Where" clause.

For example:
Select Value1 from Table where
Param1 = '232www'
Param2 IN ('test','test3')
Param3 = 30
Param4 = 100
Param5 = 'Code'
Param6 > '1/2/2007'

These tables range in size from 1000 to 300,000 records. So the question is should I index these tables? The index would consist of all of the Param columns which is essentially the entire table. Is there any benifit to doing this? (indexes will be essentially the same size as the table)

Thanks,
Nic

Nic

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-11-26 : 16:59:14

you might try indexing only columns with high selectivity.
put the CI on the unique ever increasing colum.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-27 : 08:59:29

I assume normally all six param values are supplied when querying?
You should create a clustered unique index (I'd expect it to be the primary key) on the six parameter columns, at least. A clustered index will not even take up any additional disk space, but it should net a large increase in efficiency.

e4 d5 xd5 Nf6

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-27 : 10:59:45

because the table is read-only (and hopefully in a read-only file group), you do not need to be concerned about performance of modifications, so you can freely add indexation until desired read performance is acheived or until disk space used becomes a concern. for the single specific example you provided a unique non-clustered index on the PK and a cluster index on Parm6 (leftmost column because of range lookup) and Parm2 (second column because multi-value lookup) would likely provide you with the best performance with minimal space increase. If there are several other search queries which are significantly different from the posted one, this index suggestions may not be the way to go.

pootle_flump

1064 Posts

Posted - 2007-11-28 : 08:54:41

I'd go for a combination of blindman and anonymous1. Single unique composite clustered index, first column Param6 then just the most selective columns after this. A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.

pootle_flump

1064 Posts

Posted - 2007-11-28 : 08:57:17

BTW - why are you using floats?

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-28 : 09:51:21

quote:
Originally posted by pootle_flump

A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.

A clustered index designates the order in which the actual data is stored, and so does not require additional space. Create a composite clustered unique index (primary key) on the six param columns and you should be fine.

e4 d5 xd5 Nf6

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 10:55:47

blindman is correct about clustered index, if it is the one and only index on the table. each additional index needs to include a row identifier which is the cluster key (size changes with number & datatypes of columns) if it is unique or the cluster key + 4 bytes when cluster is not unique. if you have more than the cluster index and you change the key of the cluster index to be wider, those other indexes will be wider too, therefore more disk spaced used.

pootle_flump

1064 Posts

Posted - 2007-11-28 : 11:12:50

quote:
Originally posted by blindman

quote:
Originally posted by pootle_flump

A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.
A clustered index designates the order in which the actual data is stored, and so does not require additional space. Create a composite clustered unique index (primary key) on the six param columns and you should be fine.

e4 d5 xd5 Nf6

The B-Tree requires space. Anyway - it was a minor quibble.

pootle_flump

1064 Posts

Posted - 2007-11-28 : 11:40:05

Well that's interesting. Obvious when you think about it but I hadn't expected a clustered index B-Tree to be so small relative to the table. It does take up some additional space though

USE tempdb
GO

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_heap') BEGIN
	DROP TABLE dbo.my_heap
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
	DROP TABLE dbo.my_ci
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
	DROP TABLE dbo.my_nci
END

CREATE TABLE dbo.my_heap
	(
		col_1	VARCHAR(100)	
		, col_2	INT
	)
GO	

CREATE TABLE dbo.my_ci
	(
		col_1	VARCHAR(100)	
		, col_2	INT
		, CONSTRAINT pk_my_ci	PRIMARY KEY CLUSTERED (col_2, col_1) WITH (FILLFACTOR = 100, PAD_INDEX = OFF)
	)
GO

CREATE TABLE dbo.my_nci
	(
		col_1	VARCHAR(100)	
		, col_2	INT
		, CONSTRAINT pk_my_nci	PRIMARY KEY NONCLUSTERED (col_2, col_1) WITH (FILLFACTOR = 100, PAD_INDEX = OFF)
	)
GO

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N't_name') BEGIN
	DROP TABLE dbo.t_name
END

INSERT INTO dbo.my_heap	(col_1, col_2)
SELECT	TOP 65536 REPLICATE('X', 100), number
FROM	dbo.numbers

INSERT INTO dbo.my_ci	(col_1, col_2)
SELECT	TOP 65536 REPLICATE('X', 100), number
FROM	dbo.numbers

INSERT INTO dbo.my_nci	(col_1, col_2)
SELECT	TOP 65536 REPLICATE('X', 100), number
FROM	dbo.numbers

EXEC sp_spaceused N'dbo.my_heap'
EXEC sp_spaceused N'dbo.my_ci'
EXEC sp_spaceused N'dbo.my_nci'

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_heap') BEGIN
	DROP TABLE dbo.my_heap
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
	DROP TABLE dbo.my_ci
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
	DROP TABLE dbo.my_nci
END

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 12:17:17

sorry to all, wasn't trying to attack. just wanted nic to know clustering 6 columns on a 300K table could use up space depending on the number of indexes he will eventually create.

pootle_flump

1064 Posts

Posted - 2007-11-28 : 12:36:15

quote:
Originally posted by anonymous1

sorry to all, wasn't trying to attack.

Lol - you've obviously never seen blindman in action then have you?

I was curious too - I've never built a CI containing all rows from a table so did not realise how compact the b-tree is.

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 13:06:07

like you said the b-tree is almost inconsequential (more so with more selectivity) it is the leaf level which has the row identifier and the index columns that make up the size of the index

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-28 : 15:40:08

It takes more than that to get my dander up. And I usually give some advanced warning, like a rattlesnake, growling dog, or the stall-warning on an aircraft.
But the point you made about the large clustered index requiring additional space on non-clustered indexes was valid. 'Nuther reason to like surrogate keys...

e4 d5 xd5 Nf6

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources