
 Disk Queue Length issues


Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-01 : 05:16:53
Guys,

Can I get a list of all the things we should look at to resolve this? The IO subsystem is the most obvious - my question is, does anyone have knowledge of environments/situations where a perfectly good IO subsystem gets high queue lengths because of bad app design, bad DB design, SQL installation issues, anything?

Or is it definitely only the IO subsystem?

We are not on local drives - we have an HBA connected, and we see queue lengths that peak over 300 (highest of 850) and average over several days at 9!

TIA


*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2003-07-01 : 12:37:13
Have you looked at the information on sql-server-performance.com?

Tara

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-02 : 08:25:21
I know that a lot of this is hardware/OS related. What I need to chase down is (a) whether SQL could be exacerbating the problem, and (b) what we could do from SQL's point of view to help reduce the issues - SQL coding, DB setup, SQL Server config, etc.

Some constraints - we can't change the DB or split it onto multiple filegroups any time soon (the DB design and app are 3rd party). We currently have 2 filegroups, primary (65 GB) and documents (105 GB). Logs are on a different drive, as indicated below.

Some further Info:

4 different servers, connected to different storage arrays.

Serv A (prod data, 8x1.4 GHz, 4 GB RAM)
Serv B (prod read-only, 4x550 MHz, 4 GB RAM)
Serv C (pre-prod env, 8x1.4 GHz, 4 GB RAM)
Serv D (prod-sized test, 4x500 MHz, 2 GB RAM)

Serv A E:\ (345 GB data drive) connected via SAN to a Hitachi storage array with a 4 GB cache.
Serv A F:\ (45 GB log drive) connected via SAN to a Hitachi storage array with a 4 GB cache.
Serv A G:\ (345 GB backup drive) connected via SAN to an EMC storage array with an 8 GB cache.

Serv B E:\ (517 GB data, log & backups) directly connected to an EMC array with an 8 GB cache.

Serv C E:\ (345 GB data) connected via SAN to an EMC storage array with an 8 GB cache.
Serv C F:\ (35 GB log) connected via SAN to an EMC storage array with an 8 GB cache.
Serv C G:\ (345 GB backup) connected via SAN to an EMC storage array with an 8 GB cache.

Serv D E:\ (522 GB data, log and backups) directly connected to an EMC array with an 8 GB cache.

Scenario:

3 servers (A, B, C) on Win 2K SP3, SQL 2K SP2 + the Slammer patch (we can't go above MDAC 2.6). Serv D on Win NT 4 SP5, SQL 7 SP2.

We start up one part of the application - basically a program that hands out max numbers to requesting clients (called Counter Server). NO clients are connected. Nothing else is using SQL or the box. The Counter Server startup runs for several minutes.

Serv A - Current Queue Length spikes to 250 and averages around 50 to 80.
Serv B - Current Queue Length shows no spikes and averages around 5.
Serv C - Current Queue Length jumps to 80 and averages around 50.
Serv D - Current Queue Length shows no spikes and averages around 5.

Obviously, we are suspicious of the whole SAN connection - we have EMC techs coming in.

We have been told that for an hour and a half during core hours yesterday, the Hitachi array had an 11% cache hit ratio, where the recommended figure is 80%!!!
During these periods, we see SQL buffer cache hit ratios around 99.6%.
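
For the record, this is roughly how we read the SQL-side buffer cache hit ratio straight from SQL 2000 rather than perfmon - a sketch against master..sysperfinfo (on a named instance the object_name prefix differs, hence the LIKE):

    SELECT CAST(100.0 * v.cntr_value / NULLIF(b.cntr_value, 0) AS decimal(5,2)) AS BufferCacheHitPct
    FROM master..sysperfinfo v
    JOIN master..sysperfinfo b ON b.object_name = v.object_name
    WHERE v.object_name LIKE '%Buffer Manager%'
      AND v.counter_name = 'Buffer cache hit ratio'
      AND b.counter_name = 'Buffer cache hit ratio base'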


*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-02 : 09:16:11
I have some questions:

1. Are you actually seeing poor performance, or are you just concerned with the numbers you are seeing in perfmon?
2. What model of hitachi and EMC SAN are you using?
3. Do you see these problems on both SANs?
4. Do you see the long queues on shared disk or on local disks?
5. Are you running Enterprise edition of SQL 2K? Advanced or Datacenter edition of Windows?
6. On the systems with 4GB of RAM, are you using the /3GB or /PAE switches in the boot.ini?


-ec

Edited by - eyechart on 07/02/2003 09:23:11

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-02 : 09:41:11
Some answers:

Q1. Are you actually seeing poor performance, or are you just concerned with the numbers you are seeing in perfmon?

We are VERY DEFINITELY seeing bad performance. We are having periods where users (over 250 of them in the call centre) can only log in 4 at a time, and it takes 5 to 20 MINUTES, if they don't get timed out. We see timeouts and performance pain across the board when the IO queues start spiking and averaging high queue lengths.

Q2. What model of hitachi and EMC SAN are you using?

Hitachi = Lightning 9500V
EMC = 8730 and 3830
(answers as per our infrastructure vendor)

Q3. Do you see these problems on both SANs?

Yes - when we are "SAN connected" and not "direct" (again, that is how it was described to us). We have some EMC guys gathering stats for the next day and a half regarding cache hit ratios, etc. The 11% cache hit ratio on the Hitachi was from some management software already in place there.

Q4. Do you see the long queues on shared disk or on local disks?

We only see long queue lengths on DATA drives. Refer to my scenario sketch. Serv A has E: (data, 345 GB) - we see huge spikes (the highest queue length I have seen was 850!). For Serv A, the F: (45 GB, logs) has not spiked over 2 and seems to average 0.5 or so... We have seen no queue issues on local drives (which are only used for C: (OS) and D: (application software)).

Q5. Are you running Enterprise edition of SQL 2K? Advanced or Datacenter edition of Windows?

Enterprise Edition SQL, Advanced Server for Windows.

Q6. On the systems with 4GB of RAM, are you using the /3GB or /PAE switches in the boot.ini?

Yes - SQL has 2.8 GB (it is set for dynamic 0 to 4096 MB, but we noticed that it always stops at 2.8 GB).

I don't believe this has anything to do with SQL's memory - we have very nice-looking cache hit ratios FOR SQL (>99%). The problem is the 11% (and lower) cache hit ratio on the storage arrays. 11% means that roughly 8 out of 9 IO requests are NOT being satisfied from cache!
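
As a sanity check on the memory side, this is roughly what we run to confirm the settings (a sketch; 'awe enabled' only applies on Enterprise Edition, and changing it requires a service restart):

    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE
    EXEC sp_configure 'awe enabled'        -- run_value 0 = AWE off, so SQL tops out below 3 GB
    EXEC sp_configure 'max server memory'  -- current cap in MB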

Thanx for the questions ... any thoughts, or more queries?


*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-02 : 10:10:42
quote:

Q3. Do you see these problems on both SANs?

Yes - when we are "SAN connected" and not "direct" (again, that is how it was described to us). We have some EMC guys gathering stats for the next day and a half regarding cache hit ratios, etc. The 11% cache hit ratio on the Hitachi was from some management software already in place there.



When you are SAN connected, what fiber channel switches do you use? Do you share the same FC switches with the Hitachi and EMC SAN?
-ec


Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-02 : 10:51:17
From the drawings we've seen, we have 2 separate fibre channel switches (this is what I was calling "SAN connected").

Servers B and D have direct connections to the arrays.

The servers have a mix of LP8000, LP850 and LP9002L-E fibre cards.

The worst IO queuing occurs on Serv A's E:\ (345 GB), connected on an LP8000 fibre card, via the fibre channel switch, using a dedicated port "on both sides", to the Hitachi array (4 GB cache, where we saw the 11% hit ratio). I will try to find out what the fibre channel switches are, but we are told we can have 2 Gb of "bandwidth", which might throttle down to 1 Gb.

The second worst IO queuing occurs on Serv C's E:\ (345 GB), connected on an LP9002L-E fibre card, which "shares" a port with 1 other minor system (I am told), via a separate fibre channel switch, to an EMC array (not certain if it is the 8730 or the 3830).

From the drawing presented to us, we have 2 fibre channel switches. One of those seems to attach to both the Hitachi and an EMC array (since our Serv A has E: and F: on Hitachi, but G: on EMC). The other EMC array seems to be handled via a separate fibre channel switch.


-- You'll note I'm not too familiar with these pieces of hardware, so I am getting some of this info from other people.

CiaO

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

Edited by - wanderer on 07/02/2003 11:14:21

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-02 : 11:13:54
I would check the switches for errors in their logs. Typically the switches are connected to your data network via TCP/IP, and you can connect to them via Internet Explorer to look at this. A login is usually also required.

It also sounds as if you have several generations of SAN hardware here. The Hitachi Thunder 9500 is the newest (and is probably capable of 2 Gb FC connections). However, the EMC equipment you have may not be able to connect at 2 Gb speeds (unless it has been upgraded to do so; not sure if that is an option though).

I would do the following:

1. Gather detailed info regarding HBAs and their firmware/driver levels
2. Get information on the make and model of your FC switches as well as their firmware revision. Also look for errors in the error logs.
3. Do the same with your SANs, find out their software release and if they are logging errors.
4. Present this information to both Hitachi and EMC to determine if there is a hardware compatibility problem. Both of these companies maintain detailed hardware compatibility matrices (OS, HBAs, FC switches, etc.). If you have something that is not verified compatible, that could be your problem.
5. Have EMC and Hitachi verify the config of your switches.

-ec

Edited by - eyechart on 07/02/2003 11:22:31

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-02 : 11:21:55
Thanx EC.

We had EMC guys in today, setting up to capture in-depth information regarding our environment and the performance. I will be in a meeting on Friday with them and will pose some of these questions, and ask specifically about the error logs. We can get a Hitachi guy in as well, but he would be an out-of-towner, and hence has more cost and availability issues.

While we are a biggish environment, this is nowhere near top-end - we don't have a terabyte DB, and we have a total of 450 users (max), creating up to 6,000 connections to the DB, running for an online day of between 12 and 13 hours - it's not like we have thousands of users in a 24x7 environment.

Anyway, thanks for all your feedback; I look forward to trying to get some of these answers.

Has anyone got thoughts on what I could do from SQL, or do you tacitly agree (given the info I've posted, naturally) that it is unlikely to be SQL, or anything we can influence from that side ... (although, if we shut down SQL, then the IO queues are gone ... *grin*)

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-03 : 02:52:20
The info given regarding our fibre channel switches:

2 x EMC Connectrix ED-64M, one per centre. These are enterprise directors, not the entry level switches. Full line-speed switching, 200 MB/sec per port, full-duplex. We connect everything to these Connectrixes.

@EC - got any pointers regarding those hardware compatibility matrices? Otherwise, I'll hit the EMC and Hitachi sites and start working from there...

Thx

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-03 : 11:31:12
The compatibility matrices are something that EMC and Hitachi support will have to provide to you.

Gather the information that I outlined a post or two before and send that to your support contacts at both EMC and Hitachi. They should be able to tell you if this equipment is on their HCL.

It is important that you also know the driver revision used for the HBAs and the firmware releases of all equipment (HBA, switch and SAN). You might be downlevel on the firmware in the switch (or other hardware), which could be causing your headaches.



-ec



Edited by - eyechart on 07/04/2003 15:03:24

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-04 : 02:26:25
Thanx EC. We should be seeing the EMC guys today, and they *should* have some info, as they've been reviewing the last 1 1/2 days of activity.

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-08 : 21:45:55
quote:

Thanx EC. We should be seeing the EMC guys today, and they *should* have some info, as they've been reviewing the last 1 1/2 days of activity.



Any update on this issue?


bump


Edited by - eyechart on 07/16/2003 22:36:53

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-10 : 04:17:18
Well,

We were supposed to have a meeting on Friday last week, then Monday, Tuesday ... It is now Thursday, and we are no closer. Still awaiting the meeting.

*sigh*

This is what you get when a corporation starts outsourcing infrastructure here, services there, programming somewhere else, and still does some IT itself...

I will update when I can ...

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

eyechart
Master Smack Fu Yak Hacker

3575 Posts

Posted - 2003-07-26 : 02:01:03
I'm still curious about this one. Any updates?


shsmonteiro
Constraint Violating Yak Guru

290 Posts

Posted - 2003-07-26 : 21:11:54
Hi,

Just to keep the thread going, could you please check some counters from SQL Server? Like Forwarded Records/sec, Full Scans/sec, Page Splits/sec, Scan Point Revalidations/sec, etc. I have little experience with EMC and Hitachi (I work all my time with IBM ESS), but I do remember that you are responsible for configuring the RAID level on EMC (I may be wrong), as well as the disk distribution. What RAID level are you using? Recent benchmarks have shown a performance degradation of around 50% for RAID 5 compared to RAID 1+0.

Are the disks showing the largest disk queues for logs or data? Run the following query to check the stall stats for the files:

 SELECT IoStallMS / NULLIF(NumberReads + NumberWrites, 0) AS IoStall
 FROM ::fn_virtualfilestats(@dbid, @fileid)  -- supply the database id and file id as int variables
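
On SQL 2000 the function arguments have to be constants or variables (you can't pass DB_ID('name') inline), so a runnable sketch for a single file looks like this - the logical file name below is just an example:

    DECLARE @dbid int, @fileid int
    SELECT @dbid = DB_ID()  -- run from the database in question
    SELECT @fileid = fileid FROM sysfiles WHERE name = 'primary_data'  -- hypothetical logical name
    SELECT NumberReads, NumberWrites,
           IoStallMS / NULLIF(NumberReads + NumberWrites, 0) AS AvgStallMS
    FROM ::fn_virtualfilestats(@dbid, @fileid)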


Also, what stripe unit size do you use in your system? A larger stripe could help to reduce I/O. W2K supports stripe sizes up to 64K.


Sérgio Monteiro


Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-29 : 07:45:05


This one makes me want to cry, or kill !!!

[on soap box]
We still (4 weeks after they got the EMC guys on site) have not had the meeting. To be quite honest, I don't understand how our client can let this vendor get away with the kind of lax performance they are showing in this case. We are talking about a mission-critical system for the client, yet it is 3 months since we first saw these queue length issues, and a month since we were told they were getting guys on site, and yet NOTHING has been done. AAAAArrrgh

[/off soap box]

Sigh

Regarding the questions:

Turns out our Hitachi is RAID 5. This drive was supposed to be our backup drive, but they had issues on the implementation weekend and ended up not having enough time to "swap" the drives around. So we have our production database (160 GB) running from the RAID 5 Hitachi "drive", and our backups writing to an SRDF (they tell me that = hardware striping and mirroring, thus RAID 1+0).

They didn't give me an answer regarding stripe size - if striping is done by hardware, is this configurable?

I have found some other vaguely related threads ... on SQL-Server-Performance, guys were talking about LUNs and their impact. To quote:

"Thanks SQLJunkie! Your reply corroborates what we've learned from Microsoft and EMC. We've spent the last few days digging into this. We've spoken to Microsoft and EMC. Both have said that multiple luns on a raid group will degrade performance, because of disk thrashing. According to one MS Support tech, he believes that this can degrade performance up to 60% on some systems. " the link is :
http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=951

{{I hope x-posting to another forum isn't verboten!!}}

We have multiple LUNs in our environment, which are then grouped into metavolumes to cater for our drive sizes ...

Comments, thoughts, miracles???

*#&#* *#&#* *#&#* *#&#*

Chaos, Disorder and Panic ... my work is done here!

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-07-29 : 08:26:26
quote:
Originally posted by shsmonteiro

Hi,

Just to keep the thread going, could you please check some counters from SQL Server? Like Forwarded Records/sec, Full Scans/sec, Page Splits/sec, Scan Point Revalidations/sec, etc. I have little experience with EMC and Hitachi (I work all my time with IBM ESS), but I do remember that you are responsible for configuring the RAID level on EMC (I may be wrong), as well as the disk distribution. What RAID level are you using? Recent benchmarks have shown a performance degradation of around 50% for RAID 5 compared to RAID 1+0.

Are the disks showing the largest disk queues for logs or data? Run the following query to check the stall stats for the files:

 SELECT IoStallMS / NULLIF(NumberReads + NumberWrites, 0) AS IoStall
 FROM ::fn_virtualfilestats(@dbid, @fileid)

Also, what stripe unit size do you use in your system? A larger stripe could help to reduce I/O. W2K supports stripe sizes up to 64K.


Sérgio Monteiro





Thanks for the interest Sergio ...

As to your other points: I am reviewing the IoStall SQL... Can you give me an indication of what numbers we can expect here? IoStall and Stall didn't bear much fruit in the BOL search I ran. I found this on the function: "Total amount of time, in milliseconds, that users waited for I/Os to complete on the file".

I don't have a feel for a number here - is anything greater than 1 an issue?
EDIT:

Just ran this on one of our test environments and got IO queue length spikes (this was just a random query to force a bad queue length, as we don't have a captured, scripted version of the prod transaction mix to realistically re-create the IO queue spikes).

IO stalls : 0
full scans/sec : max 8, ave 0.699
readahead pages : max 993, ave 38
buffer cache hit : 99.967%
page splits : max 16.004, ave 0.294

The page splits are interesting - all I did was a 3-table join. Nothing else was running against the database.
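
In case it helps anyone else: the /sec counters in master..sysperfinfo are cumulative raw values, so I sample twice and take the difference - a sketch (the 10-second delay is arbitrary):

    DECLARE @v1 int, @v2 int
    SELECT @v1 = cntr_value FROM master..sysperfinfo
     WHERE object_name LIKE '%Access Methods%' AND counter_name = 'Page Splits/sec'
    WAITFOR DELAY '00:00:10'
    SELECT @v2 = cntr_value FROM master..sysperfinfo
     WHERE object_name LIKE '%Access Methods%' AND counter_name = 'Page Splits/sec'
    SELECT (@v2 - @v1) / 10.0 AS PageSplitsPerSec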

*##* *##* *##* *##*

Chaos, Disorder and Panic ... my work is done here!

shsmonteiro
Constraint Violating Yak Guru

290 Posts

Posted - 2003-07-31 : 22:42:04
Hi, if the disks where you're seeing the large disk queues are log disks, then IoStall should be lower than 20. I use the same threshold for logs and data.

About the page splitting, I can only think that the splitting could be on internal worktables used to hold temporary results for joins. I've noted that your server is doing a lot of read-ahead. That's not bad for the database, but it can stress the IO subsystem. It could also indicate that your data is not fitting in the same page. You could try to change the stripe unit size on the EMC (I don't know how to do it on this storage, or whether it can be done online).



Sérgio Monteiro
Trust in no Oracle

Wanderer
Master Smack Fu Yak Hacker

1168 Posts

Posted - 2003-08-01 : 10:43:33
Just to confirm - the log drives are the F: on this server, and they have never had IO queue length issues (always <2, average 0.02).

The issue is the data drive (E: - 345 GB, with a 160 GB database on it; unfortunately, tempdb is also on that drive).

*##* *##* *##* *##*

Chaos, Disorder and Panic ... my work is done here!

shsmonteiro
Constraint Violating Yak Guru

290 Posts

Posted - 2003-08-02 : 16:53:11
Well,

I think you should get a reduction in your disk queue by moving tempdb to another disk set.
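
Something like this would do it - a sketch; the drive letter and paths are just examples, and the change only takes effect after the SQL Server service is restarted:

    -- tempdev and templog are tempdb's default logical file names
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\tempdb\tempdb.mdf')
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\tempdb\templog.ldf')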

What about the Forwarded Records/sec and Scan Point Revalidations/sec counters? You may be facing a hot disk situation. A redistribution of data between disk sets would be good.

For page splits, playing with the fill factor would help too - but only if this activity is on a real data file and not on tempdb.
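
A sketch of what I mean, rebuilding one table's indexes with a looser fill factor on SQL 2000 (the table name is just an example; 90 leaves 10% free space per page):

    DBCC DBREINDEX ('dbo.YourBigTable', '', 90)  -- '' = all indexes on the table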

Use the same function I sent before to get statistics on writes and reads. Check whether the splitting is inside the data files or tempdb.
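
For example (a sketch - substitute your own database name; file id 1 is the primary data file):

    DECLARE @dbid int
    SELECT @dbid = DB_ID('tempdb')
    SELECT DbId, FileId, NumberWrites, BytesWritten, IoStallMS
    FROM ::fn_virtualfilestats(@dbid, 1)

    SELECT @dbid = DB_ID('YourAppDB')  -- hypothetical user database name
    SELECT DbId, FileId, NumberWrites, BytesWritten, IoStallMS
    FROM ::fn_virtualfilestats(@dbid, 1)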

Just for reflection in these hard times: why not use IBM ESS? It does not have disk contention or LUN problems. You don't even need to be concerned about the physical location of data inside clusters, and the stripe unit size used is the maximum supported by the OS. Nirvana.

Sérgio Monteiro
Trust in no Oracle
