Help for Covering index - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Administration

Help for Covering index

Author

Topic

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-23 : 07:13:40

In Execution plan I found there is KeyLookup Cost (94%) on Time Table for a query.
so I beleive I need to create Covering Index on Time Table.
When I found the columns which should be in INCLUDE ( i found it form OUTPUT LIST) are,

[Time].TaskUID,
[Time].StatusCode,
[Time].NegotiatedChargeAmt,
[Time].HomeNegChargeRegExtAmt,
[Time].OperNegChargeRegExtAmt.

As there are clustered and non clustered Indexes on table "Time" ( already created), hence can one create covering index
________________________
Non Clustered indexes on table "Time" are :-
__________________________
ActivityCode
BatchID
BillingNum
ProjectSiteURN
ProjectCode, TimeEntryDate
ResourceID, ResourceSiteURN,StatusCode, TimeEntryDate
StatusCode
______________________________
clustered, unique ON TimeID
______________________________

How can be covering index be created
and which Non Clustered indexes to be dropped after creating covering index?

Can Any one suggest

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-23 : 08:40:08

I created Nonclustered index with include columns ( as i got from output List),

CREATE NONCLUSTERED INDEX [IDX_Time_StatusCode_CI] ON [dbo].[Time]
(
[StatusCode] ASC
)
INCLUDE ( [TaskUID],
[NegotiatedChargeAmt],
[HomeNegChargeRegExtAmt],
[OperNegChargeRegExtAmt]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

Now I have NC on statuscode already created.
I guess I need to drop it?
PLz suggest am i going right?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-23 : 10:28:59

yep...you dont need to have multiple indexes on same column so you can drop the other one

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-24 : 07:32:53

Further I started to analyse the Execution plan and found that
one the query has Cost 38% with clustered index scan.

which shows Predicate as
"NOT [dbo].[Task].[TaskName] as [t].[TaskName] like N'Billable Expense%'"

I want to know what does the predicate indicate.

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-24 : 07:58:10

here column name = TaskName does not have any index.

As Predicate shows the where clause stament in query
I want to know What does predicate means here

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-24 : 08:35:51

I created Non-Clustered index on column=taskname.
Still the
there is no change in Index scan and cost is same 38%.

What can be done in order to make Index scan to index seek when we have Predicate having Like Operator.

Any suggestion would be helpful.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-24 : 09:41:32

quote:
Originally posted by Vishal_sql

I created Non-Clustered index on column=taskname.
Still the
there is no change in Index scan and cost is same 38%.

What can be done in order to make Index scan to index seek when we have Predicate having Like Operator.

Any suggestion would be helpful.

just creating an index on a WHERE clause column doesnt mean that it would be used. It depends on lots of other factors like selectivity. for example if selectivity is poor (ie lot of records with matching pattern) then optimiser will decide like best way is to traverse the clustered index which is what you're getting.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-24 : 10:37:35

I agree with "visakh16"
Index maybe useful, or may be not in some case.
It's depend on selectivity,

for example
If your query result consist of almost of your table records,
It worthy optimizer to choose "Table Scan" or "Clusterd Index Scan" instead. It is a best way to compute with a large record.

To Vishal_sql
Could you please give us example of your query, and also (if possilbe) cardinality of each column in where clause?

Best Regards
Komkrit

Currently DBA for worldwide hotel reservation.
Experience in the top banking over hundreds of instances

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-24 : 11:08:33

Here is the query and

UPDATE
@temp_Project
SET
EACRevenue = pbrs.EACRevenue
FROM
(select p.ProjectCode,
p.RevisionNum,
sum(p.EACRevenue) AS EACRevenue
FROM (Select a.ProjectCode,
a.RevisionNum,
ISNULL(a.NegotiatedChargeAmt,0) AS EACRevenue
from ProjectRuleBillingSchedule a, TaskRule tr, Task t
where a.ProjectCode = tr.ProjectCode
AND a.RevisionNum = tr.RevisionNum
AND a.TaskUID = tr.TaskUID
AND a.ProjectCode = t.ProjectCode
AND a.RevisionNum = t.RevisionNum
AND a.TaskUID = t.TaskUID
AND t.TaskName not like 'Billable Expense%'
AND tr.BillTypeCode = 'F')p
GROUP BY p.ProjectCode, p.RevisionNum) pbrs,
@temp_Project TP
WHERE TP.ProjectID = pbrs.ProjectCode
AND TP.RevisionNum = pbrs.RevisionNum

and cardinality of the taskname on Task table is

--select taskname from task -- 100348 records
--select distinct taskname from task --1964 records

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-24 : 11:40:03

hmm...table variable??
whats the amount of data you're dealing with in above case?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-24 : 12:37:15

Thank Vishal_sql for your statement.

It seem TaskName is not candidate for creating an index.
Because

"AND t.TaskName not like 'Billable Expense%'"
"--select taskname from task -- 100348 records"
"--select distinct taskname from task --1964 records"

May return over 90% of overall records.
You please first check by following query.

-----------
SELECT COUNT(*)
FROM Task
WHERE TaskName NOT LIKE 'Billable Expense%'
-----------

If result return nearly 100348 records (or even more than 5000 records), it is not a good candidate to be indexed.
You may look for another way. Such as

Examine the size of table @temp_Project, if it is not over 1000 records
it is worthy to look at selectivity of

@temp_Project.ProjectID
@temp_Project.TP.RevisionNum

If any of above have a good selectivity, so you need to create index on below acordingly.

ProjectRuleBillingSchedule.ProjectCode and/or Task.ProjectCode
ProjectRuleBillingSchedule.RevisionNum and or Task.RevisionNum

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Komkrit Yensirikul
Currently be a DBA in worldwide hotel reservation. Experienced in top leading banking environment.

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-25 : 02:53:39

Thanks Komkrit for your explaination,
But as u say if the query

select count(*) from task
where taskname not like 'Billable Expense%'

returns more than 5000 records ( in my case it returns 90k records)
I would go for projectrulebillingschedule's column projectcode and revisionnum.
But I have already NonClustered index on those columns (projectcode and revisionnum.) of projectrulebillingschedule and Task table.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-07-25 : 10:30:05

quote:
Originally posted by Vishal_sql

Thanks Komkrit for your explaination,
But as u say if the query

select count(*) from task
where taskname not like 'Billable Expense%'

returns more than 5000 records ( in my case it returns 90k records)
I would go for projectrulebillingschedule's column projectcode and revisionnum.

But I have already NonClustered index on those columns (projectcode and revisionnum.) of projectrulebillingschedule and Task table.

thats ok
you can still have a nonclustered index on those columns if you've to frequently serach based on them.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-25 : 13:19:34

Thank for result Vishal_sql

I want to make sure that indexes is affect improving performance.
Please run following queries to check selectivity of each column

select count(*) from @temp_Project

select count(*) from ProjectRuleBillingSchedule
select disctinct ProjectCode from ProjectRuleBillingSchedule
select disctinct RevisionNum from ProjectRuleBillingSchedule

select count(*) from TaskRule
select distinct ProjectCode from TaskRule
select distinct RevisionNum from TaskRule

select distinct ProjectCode from Task
select distinct RevisionNum from Task

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Komkrit Yensirikul
Currently be a DBA in worldwide hotel reservation. Experienced in top leading banking environment.

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-26 : 03:27:59

Hi Komkrit,
Thanks for your kind interest in my issue

Here is result for your queries

select count(*) from @temp_Project

Result of this query depends on the parameters but still here are results
(
-- records=1 ( if run for single project as parameter)
-- records = 313 ( if run for all group of projects as parameters)
-- But the execution Plan is same for 1 project and group ( eg:- index Scan
)

select count(*) from ProjectRuleBillingSchedule -- records = 3033
select COUNT(distinct ProjectCode) from ProjectRuleBillingSchedule -- records = 193
select COUNT(distinct RevisionNum) from ProjectRuleBillingSchedule -- records = 39

select count(*) from TaskRule -- records = 100347
select Count(distinct ProjectCode) from TaskRule -- records = 3480
select Count(distinct RevisionNum) from TaskRule -- records = 46

select count(*) from Task with (NOLOCK) -- records=100349
select count(distinct ProjectCode) from Task -- records = 3480
select count(distinct RevisionNum) from Task -- records = 46

FYI,

TaskRule has
Clustered index on [ProjectCode] ,[RevisionNum] ,[TaskUID]
and Non-Clustered Index on [BillTypeCode]

Task has
Clustered index on [ProjectCode] ,[RevisionNum] ,[TaskUID]
Non-Clustered index on [TaskSeqID] ,[ProjectCode] ,[RevisionNum],[TaskName]

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-26 : 14:33:48

Hello Vishal_sql,
Thank you for your result, and let's continue on your queries.

------------------------------------------------------------------
select count(*) from @temp_Project
-- records=1 ( if run for single project as parameter)
-- records = 313 ( if run for all group of projects as parameters)
------------------------------------------------------------------
Above information is usefull.
When @temp_Project records=1, it posibly to process data just a few records from predicate "TP.ProjectID = pbrs.ProjectCode AND TP.RevisionNum = pbrs.RevisionNum"

When @temp_Project = 313 records, it absolutely to run slower than a single record. In my test scenario, its execution costs is around 300 times a single record.

------------------------------------------------------------------
select count(*) from ProjectRuleBillingSchedule -- records = 3033
select COUNT(distinct ProjectCode) from ProjectRuleBillingSchedule -- records = 193
select COUNT(distinct RevisionNum) from ProjectRuleBillingSchedule -- records = 39
------------------------------------------------------------------
This part is the most performance improve on your query.
Because there are ProjectCode only 193 value in table ProjectRuleBillingSchedule.
When it join ProjectRuleBillingSchedule to table Task and TaskRule,
it filter out ProjectCode from both joined tables to have only 193 matching ProjectCode (only 10% of Task.ProjectCode matching in join predicate). So it reduce total cost to be not much.

--------------------------------------------------------------
select count(*) from TaskRule -- records = 100347
select Count(distinct ProjectCode) from TaskRule -- records = 3480
select Count(distinct RevisionNum) from TaskRule -- records = 46

select count(*) from Task with (NOLOCK) -- records=100349
select count(distinct ProjectCode) from Task -- records = 3480
select count(distinct RevisionNum) from Task -- records = 46
--------------------------------------------------------------
It appear that ProjectCode and RevisionNum are not a good candidate to be a single element in indexes because low cardinality.
But I just noticed that (if I wrong please correct me) Primary Key of these tables are (ProjectCode, RevisionNum, TaskUID) and they are already be clustered index. No need to do more, it's already good.

Please see my test scenario below, when @Temp_Project = 1 record.
Execution plan get Clustered Index Seek through these indexes.
When @Temp_Project = 313 records. It get mix of clustered index seek and clusterd index scan. But execution cost is not much becuase ProjectRuleBillingSchedule has only 193 ProjectCode.

The last
- Execution plan cost show 38% in the node, first please find node's subtree cost. If node subtree cost is not over 5, it's not too slow.
38% of a little cost is also said "little"
In my test scenario, index seek also take 30% of cost (but 30% of 0.001 subtree costs)
- If you still facing clustered index scan on record=1, then first check column's data-type in join predicate column must be the same.
For example

WHERE TP.ProjectID = pbrs.ProjectCode
AND TP.RevisionNum = pbrs.RevisionNum

Please check data-type of TP.ProjectID and pbrs.ProjectCode
If they are not the same data-type, resulting in implicit convert of data-type and index would not be consider to be used.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Komkrit Yensirikul
Currently be a DBA in worldwide hotel reservation. Experienced in top leading banking environment.

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-26 : 14:51:32

Test Scenario

To create a sample data, I ran following command to create a simple as possible table structure, but retain the key column.
I run script to generate the plain data as similar as to your data.

The test result got index seek and very fast for single record of @Temp_Project and also not bad performance for 313 records of @Temp_Project. Just create clustered indexes on (ProjectCode, RevisionNum, TaskUID)

---------------------------------------------------------------
CREATE TABLE [dbo].[Task](
[ProjectCode] [int] NOT NULL,
[RevisionNum] [int] NOT NULL,
[TaskUID] [int] NOT NULL,
[TaskName] [varchar](50) NULL,
CONSTRAINT [PK_Task] PRIMARY KEY CLUSTERED
(
[ProjectCode] ASC,
[RevisionNum] ASC,
[TaskUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[TaskRule](
[ProjectCode] [int] NOT NULL,
[RevisionNum] [int] NOT NULL,
[TaskUID] [int] NOT NULL,
[TaskName] [varchar](50) NULL,
CONSTRAINT [PK_TaskRule] PRIMARY KEY CLUSTERED
(
[ProjectCode] ASC,
[RevisionNum] ASC,
[TaskUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[ProjectRuleBillingSchedule](
[ProjectCode] [int] NOT NULL,
[RevisionNum] [int] NOT NULL,
[TaskUID] [int] NOT NULL,
[TaskName] [varchar](50) NULL,
[NegotiatedChargeAmt] [int] NULL,
CONSTRAINT [PK_ProjectRuleBillingSchedule] PRIMARY KEY CLUSTERED
(
[ProjectCode] ASC,
[RevisionNum] ASC,
[TaskUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

--instead use of temp variable, I create simeple table for testing
CREATE TABLE [dbo].[Temp_Project](
[ProjectID] [bigint] NOT NULL,
[RevisionNum] [bigint] NOT NULL,
[EACRevenue] [int] NULL
) ON [PRIMARY]

-------------------------------------------------------
--generate simple data for Task table

DECLARE @ProjectCode int
DECLARE @RevisionNum int

SET @ProjectCode = 1
WHILE @ProjectCode <= 3480
BEGIN
SET @RevisionNum = 1
WHILE @RevisionNum <= 46
BEGIN
INSERT INTO dbo.Task
(
[ProjectCode]
,[RevisionNum]
,[TaskUID]
,[TaskName]
)
VALUES
(
@ProjectCode
,@RevisionNum
,1
,'Task' + CAST(@RevisionNum AS varchar(2))
)

SET @RevisionNum = @RevisionNum + 1
END

SET @ProjectCode = @ProjectCode + 1
END

--------------------------------------------------------
--generate data for TaskRule
INSERT INTO TaskRule
SELECT * FROM Task

--------------------------------------------------------
--generate simple data for ProjectRuleBillingSchedule table

DECLARE @ProjectCode int
DECLARE @RevisionNum int

SET @ProjectCode = 1
WHILE @ProjectCode <= 193
BEGIN
SET @RevisionNum = 1
WHILE @RevisionNum <= 39
BEGIN
INSERT INTO dbo.ProjectRuleBillingSchedule
(
[ProjectCode]
,[RevisionNum]
,[TaskUID]
,[TaskName]
,[NegotiatedChargeAmt]
)
VALUES
(
@ProjectCode
,@RevisionNum
,1
,'Task' + CAST(@RevisionNum AS varchar(2))
,10
)

SET @RevisionNum = @RevisionNum + 1
END

SET @ProjectCode = @ProjectCode + 1
END

--------------------------------------------------------------
--generate simple data for Temp_Project table

DECLARE @ProjectCode int
DECLARE @RevisionNum int

SET @ProjectCode = 1
WHILE @ProjectCode <= 100
BEGIN
SET @RevisionNum = 1
WHILE @RevisionNum <= 3
BEGIN
INSERT INTO dbo.Temp_Project
(
[ProjectID]
,[RevisionNum]
,[EACRevenue]
)
VALUES
(
@ProjectCode
,@RevisionNum
,10
)

SET @RevisionNum = @RevisionNum + 1
END

SET @ProjectCode = @ProjectCode + 1
END

---------------------------------------------------------------

finally, sample query

---------------------------

UPDATE Temp_Project
SET EACRevenue = pbrs.EACRevenue
FROM (
SELECT p.ProjectCode
,p.RevisionNum
,sum(p.EACRevenue) AS EACRevenue
FROM (

SELECT a.ProjectCode
,a.RevisionNum
,ISNULL(a.NegotiatedChargeAmt,0) AS EACRevenue
FROM ProjectRuleBillingSchedule a, TaskRule tr, Task t
WHERE a.ProjectCode = tr.ProjectCode
AND a.RevisionNum = tr.RevisionNum
AND a.TaskUID = tr.TaskUID
AND a.ProjectCode = t.ProjectCode
AND a.RevisionNum = t.RevisionNum
AND a.TaskUID = t.TaskUID
AND t.TaskName not like 'Task1%'
) p
GROUP BY p.ProjectCode, p.RevisionNum
) pbrs,Temp_Project TP
WHERE TP.ProjectID = pbrs.ProjectCode
AND TP.RevisionNum = pbrs.RevisionNum
/*unmark following line for single record of Temp_Project*/
--AND TP.ProjectID = 1
--AND TP.RevisionNum = 1

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Komkrit Yensirikul
Currently be a DBA in worldwide hotel reservation. Experienced in top leading banking environment.

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-27 : 03:22:00

quote:
Originally posted by Vishal_sql

The last
- Execution plan cost show 38% in the node, first please find node's subtree cost. If node subtree cost is not over 5, it's not too slow.
38% of a little cost is also said "little"
In my test scenario, index seek also take 30% of cost (but 30% of 0.001 subtree costs)
- If you still facing clustered index scan on record=1, then first check column's data-type in join predicate column must be the same.

Thanks Komkrit,

In my scenario Node's HASH MATCH Cost is 41% and subtree cost is 2.1857,
while Index scan on task table ( Predicate NOT LIKE 'Billable %' statement Costs 38% ( same for record=1 and group )

Hence when I checked the datatypes of Projectcode and ProjectID ,RevisionNum in @temp_project and pbrs table, it is same.i.e Nvarchar and int simultaneously.

I think can we do one thing:-
get the taskname from task table in #temp table and Join in the required query.
Wont it help in performance for like opertaor ?

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-27 : 05:34:21

eg:-

DECLARE @temp_taskname Table
(
projectcode nvarchar(21)
,RevisionNUM int
,taskuid int
,taskname nvarchar(25)

)

INSERT @temp_taskname
select distinct projectcode,(RevisionNUM),taskuid,taskname
from task where taskname like 'Billable%'

And then we can use the @temp_taskname
in our query

UPDATE
@temp_Project
SET
EACRevenue = pbrs.EACRevenue
FROM
(select p.ProjectCode,
p.RevisionNum,
sum(p.EACRevenue) AS EACRevenue
FROM (Select a.ProjectCode,
a.RevisionNum,
ISNULL(a.NegotiatedChargeAmt,0) AS EACRevenue
from ProjectRuleBillingSchedule a, TaskRule tr, Task t,@temp_taskname tn
where a.ProjectCode = tr.ProjectCode
AND a.RevisionNum = tr.RevisionNum
AND a.TaskUID = tr.TaskUID
AND a.ProjectCode = t.ProjectCode
AND a.RevisionNum = t.RevisionNum
AND a.TaskUID = t.TaskUID
~~AND t.TaskName not like 'Billable Expense%'~~
and t.projectcode <> tn.projectcode
and t.revisionnum <> tn.revisionnum
and t.taskUID <> tn.taskUID
AND tr.BillTypeCode = 'F')p
GROUP BY p.ProjectCode, p.RevisionNum) pbrs,
@temp_Project TP
WHERE TP.ProjectID = pbrs.ProjectCode
AND TP.RevisionNum = pbrs.RevisionNum

Will this be replaceable for Like and will it be faster or it takes more time for <> operator?

Vishal_sql
Posting Yak Master

102 Posts

Posted - 2012-07-27 : 06:35:53

nope,it wont be faster.
i checked at my end it will take more time.

komkrit
Yak Posting Veteran

60 Posts

Posted - 2012-07-27 : 11:12:56

Hello Vishal_sql,

It costs from aggregate function.
But cost =3 is not too bad for aggreggate function on hundreds thousands of records.
For plain and simple table, select sum() of 160000 records simple plain table use around 0.7
Also you have Joining and little more complex query, so cost=3 is ok for aggregate function.

In my test scenario, cost = 0.01 when record=1 and cost = 3 when record=300
It does not make sense to got index scan for record=1, but acutally maybe a difference data.
Could you please PM me a backup of sample database? So I could deep down into problem.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Komkrit Yensirikul
Currently be a DBA in worldwide hotel reservation. Experienced in top leading banking environment.

Next Page

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources