Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SQL Server Administration (2005)
 Inhereting a shaky environment

Author  Topic 

Jim Beam
Posting Yak Master

137 Posts

Posted - 2010-09-17 : 12:46:00
Hi all,

Starting a week Tuesday I'll be taking control of a environment, I've described it as I understand it below, as well as my own proposals, and wanted some suggestions for tightening things up;

Here's a general description of the DB situation...
* About 10 servers, 100 dbs
* A few hundred stored procedures, user-defined functions, triggers
* The DB is primarily accessed by a dozen or so vb.net applications that were created in-house.
* No Dev system, testing apparently is being done on the LIVE systems leading to periodic slowdowns (yeah, hairy I know!),
* deletes/rollbacks occurring too often, and taking too long (ie on huge tables they would delete in chunks instead of piping out the records they want to keep from T1 to t2, verifying the copy was successful, truncating t1 and re-importing from t2)

Here's what I intend to do, in rough order:

1) Verify all backups are actually present, and if not, correct this;
2) Use an MS tool to inventorise the entire SQL infrastructure and discuss results with Mgt;
3) Implement some sort of ongoing monitoring (over and above Activity Monitor and SPWho2) to figure out who is doing what to the db's;
4) Implement some sort of Windows grouping system to place the Dev users into, and allocate permissions accordingly (hopefully this will also help tracking).

All suggestions/questions warmly received.

Cheers,

Jim

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2010-09-17 : 12:56:02
(1) verify the backup strategy - how frequently are full backups done, tlog backups, differentials etc.
(2) have a separate Dev/Test systems..
(3) how frequently is the reindexing done ...


Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-17 : 13:14:51
on Day One hit the big red switch to take everything down to see how bad (or good) their disaster recovery is. You don't want them blaming you for that on day two when you haven't had a chance to improve anything (assuming you still have a job on Day Two )

(Allegedly true story as done by the incoming IT Director at Barclaycard Visa data centre at 16:49 on Friday afternoon - took them the whole weekend to sort the mess out ... he kept his job apparently ...)

Hmmm ... maybe just threatening to do it would get everyone to Fess up with all the things they KNOW wouldn't survive that exercise.

I'd stick in a script to log Database size / disk usage etc. very early on - so it runs for enough time, whilst you are busy with everything else, that you then have some useful prediction data for what's growing quickly when you are ready to interpret it.

One of the first things I would do is look at the size of Tlog backups in case there are any that are large, relative to that database's backups at other times, especially if the pattern repeats at time-of-day / day-of-week, as that will point to some transaction hog that may all-by-itself be bloating the LDF file - e.g. index rebuilds.

IIRC Tara's Blog (here on SQL Team) has a script that will log DB and Disk sizes
Go to Top of Page

Jim Beam
Posting Yak Master

137 Posts

Posted - 2010-09-17 : 13:46:59
Kristen, quite hilarious about the Big Red Button!!

On a general note though, if you had complaints about performance on a particular DB, how would you go about ID'ing what processes were at fault, and who was running them? I think it's a mix of SP_who2 and Profiler, but I'm a little rusty (as all my PREVIOUS systems were perfectly well indexed, ya know!!)

:)
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-18 : 02:27:09
As shipped Profiler has a "Long Running Query" template. However, some queries will run for long duration - but they may only run once a day, and frequently running processes that are quick, but not quick enough, will also be an issue.

We log the average run time of SProcs over time and use that as a base to detect any that have become slow, then look for whether something has broken them (in the old days they failed to recompile when Stats changed in SQL 2000, for example, but it could also mean that Stats update has failed, or data has grown and now an index is needed that was never apparent before).

So, my pitch would be:

Slow running queries that are not expected to be slow (and are reasonably frequent)
Very frequently running queries that are fast, but maybe "a bit too slow"
Queries that are slow relative to their long term baseline.
Go to Top of Page

Jim Beam
Posting Yak Master

137 Posts

Posted - 2010-09-18 : 05:07:19
Dinakar - many thanks for the reminder about reindexing frequency, I am betting it is NOT done more often than weekly, or that they update their stats!!

quote:
Originally posted by Kristen

As shipped Profiler has a "Long Running Query" template. However, some queries will run for long duration - but they may only run once a day, and frequently running processes that are quick, but not quick enough, will also be an issue.

We log the average run time of SProcs over time and use that as a base to detect any that have become slow, then look for whether something has broken them (in the old days they failed to recompile when Stats changed in SQL 2000, for example, but it could also mean that Stats update has failed, or data has grown and now an index is needed that was never apparent before).

So, my pitch would be:

Slow running queries that are not expected to be slow (and are reasonably frequent)
Very frequently running queries that are fast, but maybe "a bit too slow"
Queries that are slow relative to their long term baseline.



Ahhhh, I should have been clearer, their systems run fine for the MOST part, but as per your "Day 2" scenario above, when everything suddenly slows down at 1pm while the ex-Senior DBA is out to (liquid) lunch, I don't want to be questioned as to faultfinding without knowing what to look at/for (you've got a link to Perfmon counters and acceptable values for them in another subforum, correct?) a lot of my first week is going to be front end stuff but I don't trust fate not to throw me a curve ball on this!

Secondly, I do like your idea of establishing baselines for Sproc performance and logging run times. That'll definitely be on my 'To do' list.

Well...now I'm off to the 'Newbie TSQL" section, no doubt I need to catch up on (re)writing sprocs that use 1000 joins, 100 views and were being butchered by 10 Devs before I came along, tasked with fixing each in 1 minute flat....
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-09-19 : 02:13:59
Having a Base line figure for the SProc average runtime would help when the Senior DBA is out to lunch. IMHO that's the first thing to do because you can only get a Base Line when the system is healthy, not when it is sick! Note that when you have slowdown then everything will be slower than base line, so you need to allow for that, pro rata.

To un-butcher a poorly written Sproc I would stick commented-out SHOW PLAN and STATISTICS and then comment in one, or other, and run to see what it looks like. I figure that, in general, if I reduce the Scans and logical I/Os that is likely to be a reasonable fix. Obviously where necessary you can get more sophisticated than that, but with experience you can tell whether the Scan Count or Logical I.O is "wild" for the complexity of the query.

Here's the snippet I use:


-- Clear cache (for level playing field
-- - only if wanting to check PHYSICAL performance rather than LOGICAL performance)
-- Do **NOT** run these on a Live server
/**
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
**/



-- Comment in the SHOWPLAN to see the Query Plan, OR the STATISTICS to see the logical stats

-- SET STATISTICS PROFILE ON; SET SHOWPLAN_TEXT ON -- If Temp Tables involved
-- SET SHOWPLAN_TEXT ON
GO
-- SET STATISTICS IO ON; SET STATISTICS TIME ON


... put query here ...


SET STATISTICS IO OFF; SET STATISTICS TIME OFF
GO
SET SHOWPLAN_TEXT OFF
GO

If you are repeatedly rerunning an Sproc to test its performance you may want to put it in a transaction block so it can't change anything - I tend to do this routinely, even if I know it only has SELECT in it, as its a good habit to get into
Go to Top of Page

Jim Beam
Posting Yak Master

137 Posts

Posted - 2010-09-19 : 06:38:02
Cheers Kristen, just to keep things in the same thread, do you know where I can get hold of the MS software, or a freebie that will allow me to do a full (or fullest available) inventory of the SQL environment (server/db settings, sizes etc) ? Or is there a script here that I can run per server (or in my case, crime scene) ?
Go to Top of Page

BravehearT1326
Starting Member

24 Posts

Posted - 2010-09-20 : 04:18:53
Hi - you dont mention what verson of SQL this is going to be running on but remember in SQL 205 there are DMV's that can help pinpoint issues with the SQL instances including missing indexes, I/O performance etc.

In my past I too have inherited an environment which itself was unstable. First things I did were:

1. BACKUPS - ensure they are in place and running successfully and are the right type. Found out the backups were taking 26Hrs to complete so definately the first thing to solve there.
2. Document whats already there. This gives you your baseline to work from and to show how your recommendations will improve things.
3. Maintenance plans - see if any index rebuilds / update stats are in place especially if so called "fast queries" are taking longer than expected to execute.
4. Database growth rates - make sure these are set correctly. Had a large db that was set to grow by 1MB - so it spent most of its time extending when data was being added. The Management Studio reports on the db usage in SQL 2005 SP2 will show you when the db is extending and by how much.
5. Security - tighten it up if it isn't already. Too many cooks spoil your databases!!!
6. Patching - how up-to-date are the instances of SQL and are there any CRITICAL patches missing?

Thats just the start.....but I'm sure it'll keep you going for a few months as you amend and tweak your environment and all the best sorting it out.


Hope this
Go to Top of Page

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2010-09-20 : 18:03:51
check out the MAP tool thats free.. it will give you the SQL inventory with all the info you need..

Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/
Go to Top of Page
   

- Advertisement -