Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 To index or not?

Author  Topic 

nic
Posting Yak Master

209 Posts

Posted - 2007-11-26 : 16:52:24
Hi,
We have a number of "lookup" tables that consist of 8 or 9 columns with the following general structure:

Identity Primary Key
Param1 varchar
Param2 varchar
Param3 float
Param4 float
Param5 varchar
Param6 date
Value1 float

These are "Read Only" tables that are never Inserted/Updated in production. The whole purpose is to return the "Value1" result. All "Param" columns are used in the "Where" clause.

For example:
Select Value1 from Table where
Param1 = '232www'
Param2 IN ('test','test3')
Param3 = 30
Param4 = 100
Param5 = 'Code'
Param6 > '1/2/2007'

These tables range in size from 1000 to 300,000 records. So the question is should I index these tables? The index would consist of all of the Param columns which is essentially the entire table. Is there any benifit to doing this? (indexes will be essentially the same size as the table)

Thanks,
Nic

Nic

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2007-11-26 : 16:59:14
you might try indexing only columns with high selectivity.
put the CI on the unique ever increasing colum.

_______________________________________________
Causing trouble since 1980
blog: http://weblogs.sqlteam.com/mladenp
SSMS Add-in that does a few things: www.ssmstoolspack.com
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-27 : 08:59:29
I assume normally all six param values are supplied when querying?
You should create a clustered unique index (I'd expect it to be the primary key) on the six parameter columns, at least. A clustered index will not even take up any additional disk space, but it should net a large increase in efficiency.

e4 d5 xd5 Nf6
Go to Top of Page

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-27 : 10:59:45
because the table is read-only (and hopefully in a read-only file group), you do not need to be concerned about performance of modifications, so you can freely add indexation until desired read performance is acheived or until disk space used becomes a concern. for the single specific example you provided a unique non-clustered index on the PK and a cluster index on Parm6 (leftmost column because of range lookup) and Parm2 (second column because multi-value lookup) would likely provide you with the best performance with minimal space increase. If there are several other search queries which are significantly different from the posted one, this index suggestions may not be the way to go.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-28 : 08:54:41
I'd go for a combination of blindman and anonymous1. Single unique composite clustered index, first column Param6 then just the most selective columns after this. A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-28 : 08:57:17
BTW - why are you using floats?
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-28 : 09:51:21
quote:
Originally posted by pootle_flump

A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.
A clustered index designates the order in which the actual data is stored, and so does not require additional space. Create a composite clustered unique index (primary key) on the six param columns and you should be fine.

e4 d5 xd5 Nf6
Go to Top of Page

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 10:55:47
blindman is correct about clustered index, if it is the one and only index on the table. each additional index needs to include a row identifier which is the cluster key (size changes with number & datatypes of columns) if it is unique or the cluster key + 4 bytes when cluster is not unique. if you have more than the cluster index and you change the key of the cluster index to be wider, those other indexes will be wider too, therefore more disk spaced used.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-28 : 11:12:50
quote:
Originally posted by blindman

quote:
Originally posted by pootle_flump

A clustered index will take up space (what about the index nodes?) but unless you are tight for space you should not care.
A clustered index designates the order in which the actual data is stored, and so does not require additional space. Create a composite clustered unique index (primary key) on the six param columns and you should be fine.

e4 d5 xd5 Nf6

The B-Tree requires space. Anyway - it was a minor quibble.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-28 : 11:40:05
Well that's interesting. Obvious when you think about it but I hadn't expected a clustered index B-Tree to be so small relative to the table. It does take up some additional space though

USE tempdb
GO

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_heap') BEGIN
DROP TABLE dbo.my_heap
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
DROP TABLE dbo.my_ci
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
DROP TABLE dbo.my_nci
END

CREATE TABLE dbo.my_heap
(
col_1 VARCHAR(100)
, col_2 INT
)
GO

CREATE TABLE dbo.my_ci
(
col_1 VARCHAR(100)
, col_2 INT
, CONSTRAINT pk_my_ci PRIMARY KEY CLUSTERED (col_2, col_1) WITH (FILLFACTOR = 100, PAD_INDEX = OFF)
)
GO

CREATE TABLE dbo.my_nci
(
col_1 VARCHAR(100)
, col_2 INT
, CONSTRAINT pk_my_nci PRIMARY KEY NONCLUSTERED (col_2, col_1) WITH (FILLFACTOR = 100, PAD_INDEX = OFF)
)
GO

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N't_name') BEGIN
DROP TABLE dbo.t_name
END

INSERT INTO dbo.my_heap (col_1, col_2)
SELECT TOP 65536 REPLICATE('X', 100), number
FROM dbo.numbers

INSERT INTO dbo.my_ci (col_1, col_2)
SELECT TOP 65536 REPLICATE('X', 100), number
FROM dbo.numbers

INSERT INTO dbo.my_nci (col_1, col_2)
SELECT TOP 65536 REPLICATE('X', 100), number
FROM dbo.numbers

EXEC sp_spaceused N'dbo.my_heap'
EXEC sp_spaceused N'dbo.my_ci'
EXEC sp_spaceused N'dbo.my_nci'

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_heap') BEGIN
DROP TABLE dbo.my_heap
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
DROP TABLE dbo.my_ci
END

IF EXISTS (SELECT NULL FROM sys.tables WHERE name = N'my_ci') BEGIN
DROP TABLE dbo.my_nci
END
Go to Top of Page

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 12:17:17
sorry to all, wasn't trying to attack. just wanted nic to know clustering 6 columns on a 300K table could use up space depending on the number of indexes he will eventually create.
Go to Top of Page

pootle_flump

1064 Posts

Posted - 2007-11-28 : 12:36:15
quote:
Originally posted by anonymous1

sorry to all, wasn't trying to attack.
Lol - you've obviously never seen blindman in action then have you?
I was curious too - I've never built a CI containing all rows from a table so did not realise how compact the b-tree is.
Go to Top of Page

anonymous1
Posting Yak Master

185 Posts

Posted - 2007-11-28 : 13:06:07
like you said the b-tree is almost inconsequential (more so with more selectivity) it is the leaf level which has the row identifier and the index columns that make up the size of the index
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-11-28 : 15:40:08
It takes more than that to get my dander up. And I usually give some advanced warning, like a rattlesnake, growling dog, or the stall-warning on an aircraft.
But the point you made about the large clustered index requiring additional space on non-clustered indexes was valid. 'Nuther reason to like surrogate keys...

e4 d5 xd5 Nf6
Go to Top of Page
   

- Advertisement -