Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Nikhil341
Starting Member
9 Posts |
Posted - 2012-01-11 : 16:09:35
|
| Hi,Can you please help me how to write the percentile function in SQL?I have like 8 Web transactions and the response time for each one is 1,3,0.5,4,0.7,4,1 sec's.Now i need to calculate 85th percentile of the response time.Thanks. |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-01-12 : 09:41:41
|
I was trying to do this using the NTILE function (available in SQL 2005 and above), but I was unsuccessful. Maybe someone who is more familiar with it would be able to suggest a simple way of using that function.In the meantime, here is my long way of calculating the percentile. This would differ from what you would see if you were to do the same in Excel (and my humble opinion is that this, and not Excel is the correct way of doing it). For an interesting read on the nuances, see here: http://www.coventry.ac.uk/ec/~nhunt/pottel.pdfSo my long and convoluted way of calculating the percentile:CREATE TABLE #tmp(response_time DECIMAL(18,4));INSERT INTO #tmp VALUES (1),(3),(0.5),(4),(0.7),(4),(1);DECLARE @p DECIMAL(18,4) = 0.85;DECLARE @pN1 DECIMAL(18,4), @k1 DECIMAL(18,4), @kN DECIMAL(18,4);SELECT @pN1 = @p*(COUNT(*)+1), @k1 = MIN(response_time), @kn = MAX(response_time)FROM #tmp;WITH cte AS( SELECT *,ROW_NUMBER() OVER (ORDER BY response_time) AS RN FROM #tmp) SELECT ISNULL(MAX(CASE WHEN RN <= @pN1 THEN response_time END),@k1) + ( ISNULL(MIN(CASE WHEN RN > @pN1 THEN response_time END),@kn) - ISNULL(MAX(CASE WHEN RN <= @pN1 THEN response_time END),@k1) )*(@pn1-FLOOR(@pn1))FROM cte; |
 |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2012-01-12 : 10:46:44
|
| I suspect peso will have an algorithm somewhere for this. He always does.Charlie===============================================================Msg 3903, Level 16, State 1, Line 1736The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2012-01-12 : 10:49:35
|
| In fact, as you calculate the percentile based only on the preceding rows, I *know* he has a method that can be applied.However, I still haven't quite digested it to the point where I can use it.Charlie===============================================================Msg 3903, Level 16, State 1, Line 1736The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2012-01-12 : 11:11:17
|
| he is here:@SwePeso N 56°04'39.26 E 12°55'05.63according to his twitter feed.Charlie===============================================================Msg 3903, Level 16, State 1, Line 1736The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2012-01-12 : 12:09:58
|
Here's a sample dataset which we could play with:IF OBJECT_ID('tempdb..#PercentileData') IS NOT NULL DROP TABLE #PercentileDataGOCREATE TABLE #PercentileData ( [SeriesID] INT , [Value] DECIMAL(9,2) )GO-- Populate PercentileData 65536 row(s); WITH series ([n])AS ( SELECT ROW_NUMBER() OVER ( ORDER BY a.[n] ) FROM (VALUES (1),(1),(1), (1)) AS a (n) CROSS JOIN (VALUES (1),(1),(1), (1)) AS b (n) CROSS JOIN (VALUES (1),(1),(1), (1)) AS c (n) CROSS JOIN (VALUES (1),(1),(1), (1)) AS d (n) )INSERT #PercentileData ([SeriesID], [Value])SELECT s.[n] , vals.[val]FROM series AS s CROSS APPLY ( SELECT ABS(CAST(HASHBYTES('MD2', CAST(s.[n] AS VARBINARY(16))) AS TINYINT) / CAST(s.[n] AS FLOAT) * (n.[n])) FROM series AS n ) AS vals ([val]) SELECT * FROM #PercentileDataCharlie===============================================================Msg 3903, Level 16, State 1, Line 1736The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
TG
Master Smack Fu Yak Hacker
6065 Posts |
Posted - 2012-01-12 : 12:20:30
|
I got a different answer so it is probably wrong )this is using sunitabeck's example table with the OP's data:CREATE TABLE #tmp(response_time DECIMAL(18,4));INSERT INTO #tmp VALUES (1),(3),(0.5),(4),(0.7),(4),(1);select max(response_time)from (--assign an order to all the response times select response_time, ROW_NUMBER() over (order by response_time) rn from #tmp ) dwhere rn < (--find the first row number below which 85% of the values fall) select case when ceiling(COUNT(*) * .85) = COUNT(*) * .85 then ceiling(COUNT(*) *.85) + 1 else CEILING(count(*) * .85) end from #tmp )OUTPUT:3.0000 Be One with the OptimizerTG |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-01-12 : 13:15:43
|
The differences in our calculations may be due to:a) the choice of whether to report the value from the samples that is closest to the percentile or whether to interpolateb) the small sample size.If you look at the samples and the distribution, it looks like this:response_time Rank PercentileLocation0.5000 1 0.0000000000000.7000 2 16.6666666666661.0000 3 33.3333333333331.0000 4 50.0000000000003.0000 5 66.6666666666664.0000 6 83.3333333333334.0000 7 100.000000000000 So it would seem like the 85th percentile is 4.0 whether or not you interpolate.-- code used to generate the data aboveCREATE TABLE #tmp(response_time DECIMAL(18,4));INSERT INTO #tmp VALUES (1),(3),(0.5),(4),(0.7),(4),(1);SELECT response_time, ROW_NUMBER() OVER(ORDER BY response_time) AS [Rank], 100.0*(ROW_NUMBER() OVER(ORDER BY response_time)-1)/(COUNT(*) OVER()-1) AS PercentileLocationFROM #tmp; DROP TABLE #tmp; |
 |
|
|
|
|
|
|
|