Jamsan
Starting Member
7 Posts
Posted - 2010-11-16 : 11:50:52
Hello all,

I wanted to run the following scenario by everyone to make sure I wouldn't do myself a disservice by going forward with it. We currently have a SQL Server 2005 server with all MDF and LDF files on a single volume of our SAN (an EqualLogic iSCSI unit). We're slowly running out of disk space, to the point where we need to truncate LDF files to keep the databases operational. Budgets are a bit tight at the moment, so getting another unit in-house is easier said than done. We're trying to think of ways around the problem without spending additional funds right now.

We have 1+ TB of unused local drives on the SQL server that only house text log files from jobs and the backup files themselves (we use LiteSpeed for compressed backups). We're thinking of moving the LDF files of the databases to this drive and keeping the MDF files on the SAN volume. The local array is RAID 5, which I know isn't ideal for performance, but it would at least put the MDF and LDF activity on separate RAID controllers, and the local array should perform the same as the iSCSI array, if not slightly better.

Is it unwise to move the LDF files off the SAN? Would there be any issues with recovery from backups, transaction log restores, etc.? We currently do not have any replication in place, at either the SQL level or the SAN level.

Again, this is more of an interim solution - we'll be looking to upgrade to SQL Server 2008 (sometime in 2011) on new hardware and to purchase additional SAN space as well.

Thanks.
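For reference, the mechanics of such a move are straightforward. A minimal sketch, assuming a database named MyDB with a logical log file name of MyDB_log (both placeholders) and D: as the local array:

    -- Point the catalog at the new LDF location (takes effect when the DB next starts).
    USE master;
    GO
    ALTER DATABASE MyDB
        MODIFY FILE (NAME = MyDB_log, FILENAME = N'D:\SQLLogs\MyDB_log.ldf');
    GO
    -- Take the database offline, move the physical .ldf file to D:\SQLLogs\,
    -- then bring it back online.
    ALTER DATABASE MyDB SET OFFLINE;
    GO
    ALTER DATABASE MyDB SET ONLINE;
    GO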
Kristen
Test
22859 Posts
Posted - 2010-11-16 : 13:59:25
I would have thought moving the LDF onto a different "spindle" from the MDF would make a significant difference. Ideally the disks for the LDF would be optimised for sequential access, but in practice just not having to move the disk heads between LDF and MDF files will help.

I don't think the backups are a problem. Make sure that the Full/Differential backups go to a different disk spindle (i.e. not the MDF spindle), and likewise that the Log backups go to a different spindle from the LDF files.

If you have the MDF + Log backups on a single spindle, and/or the LDF + Full backups on the same spindle, you are increasing the risk that you will have to restore to a Full backup rather than to the latest Log backup on disk. E.g. if the MDF + Log backup disk fails, you then have no database and no recent log backups.

If the database becomes corrupted but you are still able to take a "tail" backup, the odds are good you will be able to restore a Full backup and all subsequent Log backups, including the final "tail" log backup, and thus have zero data loss.
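A minimal sketch of that "tail" backup and restore sequence (MyDB and the paths are placeholders, and native backup commands are shown rather than LiteSpeed's wrappers). NO_TRUNCATE lets the log be backed up even when the data file is damaged:

    -- Capture the tail of the log after the MDF is lost/corrupted.
    BACKUP LOG MyDB
        TO DISK = N'G:\LogBackups\MyDB_tail.trn'
        WITH NO_TRUNCATE, INIT;

    -- Restore the chain: full backup, each log backup in order, then the tail.
    RESTORE DATABASE MyDB FROM DISK = N'G:\FullBackups\MyDB_full.bak' WITH NORECOVERY;
    RESTORE LOG MyDB FROM DISK = N'G:\LogBackups\MyDB_1100.trn' WITH NORECOVERY;
    RESTORE LOG MyDB FROM DISK = N'G:\LogBackups\MyDB_tail.trn' WITH RECOVERY;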
Jamsan
Starting Member
7 Posts
Posted - 2010-11-16 : 14:34:07
Our DBs are in the FULL recovery model and we take hourly transaction log backups. In the event we need to "truncate" the logs to recover disk space, we take a full backup directly after truncating the logs to start the chain over again (in essence, making all previous transaction log backups useless).

So as of now, we have the MDFs and LDFs on the same volume (SAN), and we have our full backups and transaction log backups on the same drive (local disk). Unfortunately, we only have one RAID controller in the server itself, so our only option for moving the LDFs off the SAN would be to put them on the same volume as the full backups (also copied to tape) and the transaction log backups (also copied to tape). Is there an issue with this configuration other than potentially putting most of the proverbial eggs in one basket?

What happens in the case of a local server failure where we need to bring up another SQL box? Will SQL automatically mount new LDF files if we attach the MDFs from the SAN?
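For what it's worth on that last question: an MDF can be attached without its LDF only if the database was shut down cleanly, in which case SQL Server can build a fresh log during the attach. A hedged sketch (name and path are placeholders):

    -- Only succeeds if the database was cleanly shut down; the log backup chain
    -- is broken afterwards, so take a new full backup immediately.
    CREATE DATABASE MyDB
        ON (FILENAME = N'S:\SQLData\MyDB.mdf')
        FOR ATTACH_REBUILD_LOG;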
Jamsan
Starting Member
7 Posts
Posted - 2010-11-16 : 14:37:06
tkizer, to add to your post: when we perform transaction log backups, should the physical LDF file shrink down, or should it continue to grow? We use the built-in maintenance plan for tlog backups, and here is the SQL generated by the plan:

    BACKUP LOG [DBNAME]
    TO DISK = N'F:\BACKUP\USER\DBNAME_backup_201011161435.TRN'
    WITH NOFORMAT, NOINIT, NAME = N'DBNAME_backup_20101116143553',
         SKIP, REWIND, NOUNLOAD, STATS = 10
    GO
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2010-11-16 : 14:40:55
You should not shrink the LDF down, practically ever. It should only ever be done manually, and only because a one-off process went to shit and caused the file to grow beyond what was needed. The LDF file should not continue to grow; it should settle down at some point.

Perhaps hourly tlog backups aren't enough. We back ours up every 15 minutes.

Could you give us some numbers so we can help you predict what size it should be? For instance, what's the size of the MDF, and what's the size of the LDF? What's the size of the largest index? You only need to provide this for the databases where LDF growth is a problem or where you find yourself having to truncate.

In production, I NEVER truncate the transaction log unless disk space is at 0 MB or close to it and the SAN drive can't be expanded quickly. There are other ways to fix the issue.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Subscribe to my blog
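Queries along these lines will produce those numbers (sizes come back in 8 KB pages, converted to MB here):

    -- File sizes (MDF/LDF) for every database on the instance.
    SELECT DB_NAME(database_id) AS database_name,
           name AS logical_name,
           type_desc,                        -- ROWS = data file, LOG = log file
           size * 8 / 1024 AS size_mb
    FROM sys.master_files
    ORDER BY database_name, type_desc;

    -- Largest indexes in the current database.
    SELECT TOP 5
           OBJECT_NAME(ps.object_id) AS table_name,
           i.name AS index_name,
           ps.used_page_count * 8 / 1024 AS size_mb
    FROM sys.dm_db_partition_stats AS ps
    JOIN sys.indexes AS i
      ON i.object_id = ps.object_id AND i.index_id = ps.index_id
    ORDER BY ps.used_page_count DESC;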
Jamsan
Starting Member
7 Posts
Posted - 2010-11-16 : 21:29:54
The only time they need to be truncated is when our weekend index rebuild job kicks in (again, using the built-in maintenance plan from SQL Server - maybe this is our problem). At all other times the growth on the logs is stable and we never need to touch them. We rebuild indexes weekly and reorganize monthly. I've read we should look into alternatives to the built-in maintenance plans, running them nightly on certain DBs and on other DBs on other nights, but I honestly haven't had the time to dig around (this is just part of my job, unfortunately).

Our biggest database is around 100 GB or so, the biggest table being 13 GB with another 35 GB of indexes on it. Again, this is the biggest - we have two other large DBs, with the rest being smaller (less than 10 GB).

On the restore front, we run restores from tape almost weekly (for one reason or another - not always SQL). We also restore the full backups we take nightly to another SQL Server (for reporting purposes), so that serves as a warm standby without the current day's transactions.
chadmat
The Chadinator
1974 Posts
Posted - 2010-11-17 : 00:29:35
Or, more proactively, check fragmentation: reorganize when it's between 10-30% and rebuild when it's >30%.

-Chad
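A minimal sketch of that policy (the thresholds match the post; dbo.MyTable and IX_MyIndex are placeholder names):

    -- List indexes in the current database above the reorganize threshold.
    SELECT OBJECT_NAME(ips.object_id) AS table_name,
           i.name AS index_name,
           ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
      ON i.object_id = ips.object_id
     AND i.index_id = ips.index_id
    WHERE ips.avg_fragmentation_in_percent > 10
      AND i.name IS NOT NULL;           -- ignore heaps

    -- Then, per index, based on the fragmentation reported:
    ALTER INDEX IX_MyIndex ON dbo.MyTable REORGANIZE;   -- 10-30% fragmented
    ALTER INDEX IX_MyIndex ON dbo.MyTable REBUILD;      -- over 30% fragmented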
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts
Posted - 2010-11-17 : 02:48:00
Also consider switching to bulk-logged recovery for the duration of the index rebuilds, if you can afford the minor associated risks.

--
Gail Shaw
SQL Server MVP
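A sketch of that switch around the rebuild window (MyDB and the path are placeholders). The main risk alluded to: point-in-time restore is not possible into a log backup that contains minimally logged operations.

    ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED;
    -- ... run the index rebuilds here; they are minimally logged ...
    ALTER DATABASE MyDB SET RECOVERY FULL;
    -- Take a log backup straight away to re-establish full point-in-time coverage.
    BACKUP LOG MyDB TO DISK = N'G:\LogBackups\MyDB_post_rebuild.trn';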
Kristen
Test
22859 Posts
Posted - 2010-11-17 : 03:56:14
We increase TLog backup frequency to every 2 minutes during index maintenance. Index maintenance is a huge log-hog compared to our normal operations - such that it completely distorts the size of the LDF file we would otherwise have to carry.

I think changing the TLog backup interval from 1 hour to 15 minutes will help you. You'll have a bit more overhead because of the extra files, but that apart, the physical amount of data backed up per day will be the same. You'll also get an improvement in your potential disaster recovery loss - if the TLog file is available but the database is toast, you'll have at most a 15-minute loss rather than an hour. And your LDF file will only have to store at most 15 minutes of transactions, rather than 60.

As Tara and Chad say, we too rebuild indexes based on fragmentation. We do this every night, on every database - but only the indexes with fragmentation over a set threshold get it - so it's much more efficient than "rebuild the lot", plus users get the benefit of highly fragmented indexes being rebuilt "soon" rather than "at the weekend". There's less strain on the logs doing it little-and-often, too. It depends a bit on whether you have a quiet period overnight, though. We never REBUILD our largest indexes - they only get defragged (REORGANISED), as our systems are 24/7/365 and we don't have an Enterprise licence to allow ONLINE reindexing. (We do REBUILD them in scheduled maintenance periods, but that's pretty rare.)

"We also restore the full backups we take nightly to another SQL Server"

Note that you can (and should!) run DBCC CHECKDB on the restored copy to check that the backup-and-restore process is sound. This also means that you do NOT need to run DBCC CHECKDB on the production system - any database corruption will also be present after a restore from the FULL backup - so you may be able to save some CPU and disk cycles by removing that from the maintenance tasks on the production server.
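A sketch of that post-restore check on the reporting server (MyDB_Report is a placeholder name):

    -- Run nightly after the restore completes; any error output means the
    -- backup, the restore, or the source database itself has a problem.
    DBCC CHECKDB (MyDB_Report) WITH NO_INFOMSGS, ALL_ERRORMSGS;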
Jamsan
Starting Member
7 Posts
Posted - 2010-11-17 : 09:17:19
Tara - Thanks for the links. I've gone through a bit, and I'll definitely go through them in more detail over the next few days and begin testing them in our dev environments.

One item I'm confused about: our TLog backup just ran ~10 minutes ago, but the sizes of the LDFs are still what they were prior to the backup. Are the physical files themselves supposed to shrink down, or is something else in play here? Our log files are set to autogrow in 100 MB increments, unrestricted. Should this be set to a different value?

Kristen - thanks for your suggestions as well. I'll definitely look to shift the DBCC CHECKDB job to the reporting server. We currently run this process weekly on all DBs. Do you recommend shifting it to a nightly schedule after the restores complete, or keeping it weekly but still shifting it to reporting? I see Tara recommends running it daily, so I wanted to confirm that. I'm also guessing we should monitor the results of the job more closely, as if the job does correct any issues on the reporting database, we would also want to run it in production to correct the same issues on the production DB.

Thank you all for your suggestions and input thus far. This has been very helpful.
Kristen
Test
22859 Posts
Posted - 2010-11-17 : 13:13:46
"Are the physical files themselves supposed to shrink down, or is something else in play here?"The files remain the same size, the pages containing transactions, which are now backed up, are marked as available for reuse - so the file space gets reused. This is MUCH more efficient than repeatedly Shrinking and re-Extending the file - think of it as the file being pre-allocated, but able to extend itself if ever it becomes full."Do you recommend shifting this process to a nightly schedule after the restores complete, or still keep it weekly but still shifting it to reporting?"If you can easily do the DBCC CHeck each night then that is better - if you ever get a corrupt database the sooner you know about it the soon you can start fixing the problem - before it turns a crisis into a major disaster!"I'm always guessing we should monitor the results of the job more closely, as if the job does correct any issues on the reporting database, we would also want to run that job in production to correct the issues on the production DB."With the default settings it won't fix anything. I recommend that you don't use the automatic "fix" settings in the daily environment - it should only ever break if something bad happens, and then you wan to know about it so you can investigate. You can then rehearse a fix on Report Server version, and once that is sorted reapply the same steps on the production server. |
Jamsan
Starting Member
7 Posts
Posted - 2010-11-17 : 13:42:44
Tara / Kristen - That makes sense, thanks.

What do you typically do in the case of a runaway query? Is it OK to truncate the log in that situation (if the log file is, say, typically only 5 GB and the query bloats it to 20 GB)?

When the 15-minute tlog backup job and the new reindex/reorganize job go into production, would it be best to re-size the logs appropriately at that point to start fresh (i.e. unbloated from previous increases in size)?

Thanks again.
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2010-11-17 : 13:47:04
I never truncate the log. If there's a runaway query that caused it to bloat, then you can shrink it via DBCC SHRINKFILE.

Resizing them depends on the biggest index: you'll need about 125% of the index's size free inside the MDF file and 100% of it free inside the LDF file to rebuild any particular index. If multiple index rebuilds kick off before a tlog backup, then your files need to accommodate all of those too. I'd say leave them where they are and watch the used space while the job is running; then you'll know what you need going forward. And note that the custom code is dynamic - you could be rebuilding different indexes each time, depending on the state of the indexes and what thresholds you use.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Subscribe to my blog
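A sketch of that one-off shrink after such a bloat (the logical file name and target are placeholders; the target is in MB):

    USE MyDB;
    GO
    -- Shrink the log back to its normal working size, e.g. 5 GB.
    DBCC SHRINKFILE (MyDB_log, 5120);
    GO
    -- If it doesn't shrink, take a log backup first (the active portion of the
    -- log may be at the end of the file), then run the shrink again.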
Jamsan
Starting Member
7 Posts
Posted - 2010-11-17 : 14:33:47
Tara - Thank you again for the information.

One last question: on your recommended list of production jobs, it appears you have the ALTER INDEX job running daily/weekly with a minimum fragmentation of 50% and performing a rebuild. I've heard recommendations (both in this thread and elsewhere) of doing a reorganize at 10-30% fragmentation and a rebuild at 30%+. Are you still following your method of rebuilding at 50%+ (or possibly I'm misreading/misunderstanding your SP and you're actually doing the latter)?

Thanks!
tkizer
Almighty SQL Goddess
38200 Posts
Posted - 2010-11-17 : 14:51:52
There isn't one good answer for the thresholds; everyone's environment is different. You will need to test to determine what works best for yours. The test would involve checking performance at various fragmentation levels: when performance is impacted by fragmentation, you should rebuild/reorganize. Some people argue that fragmentation very rarely affects performance, so they don't see the need for a scheduled job like this.

On most of my systems we aren't doing any rebuilds, only reorganizes. We've got the threshold set to various numbers, but the most typical is 30%. On the systems where we are doing rebuilds, I have some set to 50, some to 75, and some to 10.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Subscribe to my blog