| Author |
Topic |
|
ThreePea
Yak Posting Veteran
83 Posts |
Posted - 2002-06-21 : 11:36:32
|
Hey everyone, I am analyzing our indexes and trying to weed out the bad ones. I have been studying selectivity all morning and have read everything I can get my hands on. But I still have three nagging questions and I hope one of the gurus here can help. Here is my output from DBCC SHOW_STATISTICS (container, active): Rows Rows Sampled Steps Density-------------------- -------------------- ------ -------141254 141254 11 0.0 All density Average Length Columns------------------------ ---------------- --------------------9.0909094E-2 4.0 Customer_No5.5555556E-2 6.0 Customer_No, ActiveRANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS------------ ----------- ----------- -------------------- --------------6 0.0 1679.0 0 0.018 0.0 1674.0 0 0.0110 0.0 57637.0 0 0.09752 0.0 236.0 0 0.019965 0.0 49103.0 0 0.019966 0.0 1612.0 0 0.020110 0.0 8.0 0 0.020210 0.0 139.0 0 0.020230 0.0 7.0 0 0.020743 0.0 57.0 0 0.021132 0.0 29102.0 0 0.0 1. Why is the Density 0.0? My reading stated that a density of .1 or less is highly selective. A unique index has a selectivity of .000007079 (1/141254), which should be the lowest density possible on this table. So 0.0 doesn't make sense.2. The equation for calculating density, according to "Inside SQL Server 7.0" is (nonfrequent count / distinct nonfrequent count) / (number of rows). "nonfrequent count" is the number Customer_No's that appear only once in the histogram steps. It looks like this number would be 11. And of course the number of rows is 141254. But what does "distinct nonfrequent count" mean, and how do I calculate it?3. Lastly, I didn't show it here, but the result also shows a histogram for the active column as well. I was under the impression from what I read that histogram data was only stored for the first component of the index?Thanks so much.========================================Tolerance is the last virtue of an immoral society. |
|
|
colinm
Yak Posting Veteran
62 Posts |
Posted - 2002-06-21 : 11:46:32
|
| 1) Could be that it is simply being rounded off to 0? |
 |
|
|
rkc01
Starting Member
43 Posts |
Posted - 2002-06-21 : 13:12:29
|
| Look at the discussion in the general forum for 'statistics on composite indexes' posted on 06/05. This may be of help. |
 |
|
|
ThreePea
Yak Posting Veteran
83 Posts |
Posted - 2002-06-21 : 13:31:39
|
quote: 1) Could be that it is simply being rounded off to 0?
No, I don't think so. The column uses exponential notation, and I've seen some pretty low numbers in that column for other tables, so I don't think it is rounding. quote: Look at the discussion in the general forum for 'statistics on composite indexes' posted on 06/05. This may be of help.
Thanks, rkc01, but I already did. It was a good discussion, but did not answer the questions I have here. The articles that it referred to I had already devoured.However, in studying the results, I think my question #3 answers question #1. It would appear that SQL Server will finally determine that the 1st column isn't selective (hence the density of 0), and will start keeping statistics on the 2nd column of the index.That just leaves question #2, which I haven't been able to figure out yet.========================================Tolerance is the last virtue of an immoral society. |
 |
|
|
izaltsman
A custom title
1139 Posts |
Posted - 2002-06-21 : 14:02:34
|
quote: That just leaves question #2, which I haven't been able to figure out yet.
I wasn't able to find the equation you are using anywhere... AFAIK, density is calculated as 1/Range_Rows. Since your Range_Rows value appears to be 0 throughout, SQL Server is getting "divide by 0" error, and reports the result of the calculation back as 0. |
 |
|
|
ThreePea
Yak Posting Veteran
83 Posts |
Posted - 2002-06-21 : 14:32:53
|
quote: I wasn't able to find the equation you are using anywhere... AFAIK, density is calculated as 1/Range_Rows
The Density value for the histogram, and the All_Density values for the composite columns, are calculated with different, though similar, algorithms:density = (nf count / distinct nf count) / (# of rows)all_density = 1 / cardinality of index keysI got those calculations from Inside Microsoft SQL Server 7.0 by Ron Soukup and Kalen Delaney, pg 813. Those calculations are repeated here at [url]http://www.sqlmag.com/Articles/Index.cfm?ArticleID=5660[/url]I know I've only been studying this for a day, but I still think my earlier thought is correct - SQL Server has decided that the first component is useless, and has moved on to keeping statistics on the 2nd index component. I base this on a couple observations: Like you said, the Density and the Range_Rows are all 0. But secondly, on every index where this is true, SHOW_STATISTICS also returns a Density and Histogram with proper values for the 2nd column component of the index.========================================Tolerance is the last virtue of an immoral society. |
 |
|
|
JohnDeere
Posting Yak Master
191 Posts |
Posted - 2002-07-25 : 01:30:40
|
| I noticed the output was from sql 2000. Did the calculations changein 2000?LHarra"They that can give up essential liberty to obtain a little temporary safetydeserve neither liberty nor safety."(Benjamin Franklin) |
 |
|
|
JohnDeere
Posting Yak Master
191 Posts |
Posted - 2002-07-25 : 09:38:53
|
| Found this on the MSDN siteA separate density value is also displayed in the first line of dbcc show_statistics, but this is not used by the optimizer in SQL Server 2000.http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/statquery.aspThat I am still not sure how the density is 0 but it appears sql 2000 optimizer does not use that information.LHarra"They that can give up essential liberty to obtain a little temporary safetydeserve neither liberty nor safety."(Benjamin Franklin) |
 |
|
|
ThreePea
Yak Posting Veteran
83 Posts |
Posted - 2002-07-25 : 09:45:48
|
| Thanks for your input, John. That is a big help.Jerry========================================Tolerance is the last virtue of an immoral society. |
 |
|
|
|