Author |
Topic |
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 08:14:17
|
Is there a rule of thumb for when you would NOT drop/recreate indexes when loading a small % of total rows into a very large table? On large imports of rows, dropping/recreating indexes has a large impact on the system. I need to find alternatives to reduce the impact on the transaction log and system resources. Thanks.  |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 08:25:24
|
One may be: don't do that under such circumstances. Just update statistics, and maybe defragment indexes instead of rebuilding them after the load. Rebuild indexes occasionally, when fragmentation is above an acceptable level, and define a proper fill factor. However, it's better to test a few scenarios and see what works best for you. |
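A minimal sketch of what that post-load maintenance could look like on SQL Server 2000-era systems (table name, index name, sample percentage, and fill factor are all hypothetical; plug in your own):

```sql
-- Refresh statistics after the load instead of rebuilding everything
-- (dbo.BigFactTable and IX_Fact_SaleDate are made-up names):
UPDATE STATISTICS dbo.BigFactTable WITH SAMPLE 20 PERCENT

-- Defragment one index in place; 0 = current database:
DBCC INDEXDEFRAG (0, 'dbo.BigFactTable', 'IX_Fact_SaleDate')

-- Occasional full rebuild with an explicit fill factor (here 80,
-- i.e. 20% free space per page to absorb future inserts):
DBCC DBREINDEX ('dbo.BigFactTable', 'IX_Fact_SaleDate', 80)
```

The idea is that the cheap steps (statistics, defrag) run after every load, while the expensive DBREINDEX runs only when fragmentation actually warrants it.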
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 08:46:14
|
I am mostly concerned with system performance and impact. I know it would be slower to import data with the indexes in place, and fragmentation would occur. On the other hand, dropping the indexes places such a load on the system. I need to find the happy medium without killing the server. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2005-08-18 : 08:49:41
|
mmarovic: I reckon you are answering a different question. I would rarely drop a clustered index, but I would usually pre-sort the imported data according to any clustered index. The other indexes I would drop based on a test! How long does it take to drop & recreate? How many rows import per minute WITH indexes in place? How many per minute WITHOUT? Now you can calculate how many rows you have to import before it's worthwhile pre-dropping indexes. Anyway, that's what I think I would do. Kristen |
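That break-even calculation can be sketched in a few lines of T-SQL. All three figures below are hypothetical; substitute your own measured timings:

```sql
-- Made-up measurements: 120 min to drop+recreate, 2,000 rows/min imported
-- with indexes in place, 10,000 rows/min with indexes dropped.
DECLARE @drop_recreate_min float, @rows_per_min_with float, @rows_per_min_without float
SELECT @drop_recreate_min = 120.0,
       @rows_per_min_with = 2000.0,
       @rows_per_min_without = 10000.0

-- Importing N rows with indexes costs N/@rows_per_min_with minutes;
-- without, it costs N/@rows_per_min_without + @drop_recreate_min.
-- Dropping pays off only above the break-even row count:
SELECT break_even_rows =
       @drop_recreate_min / (1.0/@rows_per_min_with - 1.0/@rows_per_min_without)
-- With these figures the break-even is about 300,000 rows: below that,
-- you are better off leaving the indexes in place.
```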
 |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 09:01:49
|
Not really different. I mean I would not drop indexes if I am just loading deltas and performance is acceptable, but I would take care of statistics and index fragmentation the way I described. However, I like your recommendations. As I already said, there are a few things that can be tested in order to find the solution that is the best fit. |
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 09:04:03
|
Thanks Kristen. It takes about an hour to drop and an hour to recreate. This is a data warehouse. There are 15 indexes on the whole cube, one of them containing a huge number of the table's fields. We are bringing in about 200,000 records into a cube of 5 million records. We have not tried to do it with the indexes in place. The biggest issue is the transaction log growth during the rebuild of the indexes, and system performance. Would there be more impact if we left them in place? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2005-08-18 : 09:23:29
|
If your log file size is not an issue, then it won't get any bigger next time. For a data warehouse, if you are using RECOVERY MODEL = SIMPLE (no transaction log backups) then I don't suppose there will be any issue with backup file sizes. Make sure there is a COMMIT between each index create - to release the log space, perhaps? I can't see importing 200,000 rows with 15 indexes in place taking more than 2 hours, so it sounds like you are better off having the indexes in place when you import - provided that it doesn't fragment the indexes to the extent that it causes a real-world performance deterioration, depending on RAM and caching etc. The index rebuild utilities can give you statistics on fragmentation. (We rebuild, or defrag, all indexes based on fragmentation stats - most times when the process runs it only finds a few indexes that need attention.) Kristen |
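One way to keep the log from ballooning during the rebuild phase is to handle one index at a time with a log backup in between, so each rebuild's log space can be reused by the next. A rough sketch (database, table, index names, and backup path are all hypothetical):

```sql
-- Check fragmentation first; look at Logical Scan Fragmentation and
-- Scan Density in the output to decide whether a rebuild is needed:
DBCC SHOWCONTIG ('dbo.BigFactTable', 'IX_Fact_SaleDate')

-- Rebuild one index, then back up the log before touching the next,
-- so the log records from this rebuild can be truncated:
DBCC DBREINDEX ('dbo.BigFactTable', 'IX_Fact_SaleDate')
BACKUP LOG MyWarehouse TO DISK = 'E:\Backup\MyWarehouse_log.bak'

DBCC DBREINDEX ('dbo.BigFactTable', 'IX_Fact_Customer')
BACKUP LOG MyWarehouse TO DISK = 'E:\Backup\MyWarehouse_log.bak'
```

This doesn't reduce the total log generated, but it caps how much live log space is needed at any one moment.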
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 09:40:16
|
We have the DB in RECOVERY MODEL = FULL for business reasons. Will the transaction log size be affected any differently if the indexes remain in place than if they were dropped? I appreciate the other tips. We will give them a try during our testing. Thank you very much. |
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2005-08-18 : 10:01:48
|
You could switch back and forth between recovery models IF there was no other (updating) activity on the system during the data loads...and that the switching process and the controls implemented around it provided a level of data security, performance and system reliability that made it worthwhile. |
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 10:20:20
|
Hi AndrewMurphy. If we did the load in SIMPLE instead of FULL, we would have to do a backup of the database after the load. The DB is too large and we load often. |
 |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 10:33:57
|
Then you can try to increase the tran log backup frequency and to avoid long-running transactions. To do that, you can load data in batches of, let's say, 5000 rows. As I said before, avoid dropping/(re)creating indexes. You may defragment some of them. Index rebuilds can be a separate task that runs less frequently - certainly not after each load, and not all indexes each time, just the ones that are heavily fragmented at the moment. |
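A batched load along those lines might look like this (staging and target table names, columns, and the key are all made up for illustration; the batch size of 5000 follows the suggestion above):

```sql
-- Copy rows from a staging table into the fact table 5000 at a time.
-- Each batch is its own committed transaction, so frequent log backups
-- between batches can reclaim the space it used.
DECLARE @rc int
SET @rc = 1
SET ROWCOUNT 5000          -- SQL Server 2000-era way to cap rows per statement

WHILE @rc > 0
BEGIN
    BEGIN TRANSACTION

    INSERT INTO dbo.FactSales (SaleID, SaleDate, Amount)
    SELECT s.SaleID, s.SaleDate, s.Amount
    FROM dbo.StagingSales s
    WHERE NOT EXISTS (SELECT * FROM dbo.FactSales f WHERE f.SaleID = s.SaleID)

    SET @rc = @@ROWCOUNT   -- capture immediately after the INSERT
    COMMIT TRANSACTION
END

SET ROWCOUNT 0             -- always reset, or later statements stay capped
</imports -->
```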
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 11:09:32
|
Hi mmarovic. We back up the transaction logs every 10 minutes, around the clock, and log ship them to our DR warehouse. They get very large even with that short an interval. We were dropping the indexes to increase speed, but recreating them gives us large transaction logs for an additional hour after the load. This is taxing on the server. I want to do loads as non-invasively as possible. |
 |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 11:22:23
|
quote: We back up the transaction logs every 10 minutes, around the clock, and log ship them to our DR warehouse.
We did the same; I'm familiar with the problem you describe. quote: They get very large even with that short an interval. We were dropping the indexes to increase speed, but recreating them gives us large transaction logs for an additional hour after the load. This is taxing on the server. I want to do loads as non-invasively as possible.
That's exactly what I'm talking about. However, I was wrong about index defragmentation (I haven't worked as a DBA for the last few months, so I've obviously started forgetting...): from a log-production point of view, it is better to rebuild an index if it is fragmented. It is clear that you have to find the right balance between optimizing load speed (the fewer indexes, the better) and minimizing log production (the less index dropping/creating, the better). So it is up to you to test and find the balance. |
 |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 11:24:47
|
On top of that, review the index design and see if you can decrease the number of indexes without significantly decreasing performance. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2005-08-18 : 12:04:14
|
Does moving the non-clustered indexes to a different logical file make a difference to any of this? You guys are talking bigger than I have to support, so I'm out of my depth, but I'm thinking: 1) Index fragmentation in a separate logical unit may be less, and thus rebuilds may be required less often (I think I read this, at least!!) 2) Maybe log traffic is less too (I have absolutely no idea on this one; maybe it's not a factor of logical units) 3) Maybe you already did all this for performance reasons etc. and I'm getting you all back to Square One! Kristen |
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-18 : 12:42:08
|
So, I guess there is no rule of thumb for when you would NOT drop/recreate indexes - it is just a matter of testing. I will leave the indexes in place, run the load, check fragmentation (defrag if necessary), and also check the speed of the load versus dropping the indexes. If the speed is very slow with indexes in place, I will drop the indexes, do the load, and rebuild with a commit after each index rebuild. Thanks so much everyone for your input! |
 |
|
mmarovic
Aged Yak Warrior
518 Posts |
Posted - 2005-08-18 : 17:06:41
|
Kristen: separating indexes into their own filegroups helps performance but doesn't affect fragmentation. cgunner - Just a few more hints: 1. You don't have to drop or keep all indexes; you can drop/recreate just some of them. 2. When you find the optimal fill factor, you will not need to rebuild indexes after each load. So you can check fragmentation before the load, drop the heavily fragmented indexes before the load, then create them again afterwards. 3. I remember an article on sql-server-performance.com about automating index rebuilds based on fragmentation level. There is also a script included. You can borrow ideas from there and incorporate them into your solution. The author is Tom Pulen (or something like that). |
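For the filegroup and fill-factor points, a recreate might look like this (the filegroup INDEX_FG is assumed to exist already via ALTER DATABASE ... ADD FILEGROUP / ADD FILE; table, column, and index names are hypothetical):

```sql
-- Drop a heavily fragmented non-clustered index before the load...
DROP INDEX dbo.FactSales.IX_FactSales_SaleDate

-- ...and recreate it afterwards on its own filegroup, with a fill
-- factor that leaves free space per page for the next few loads:
CREATE NONCLUSTERED INDEX IX_FactSales_SaleDate
ON dbo.FactSales (SaleDate)
WITH FILLFACTOR = 80
ON INDEX_FG
```

Putting the index on a separate filegroup (ideally on separate spindles) helps read/write throughput, but as noted above it won't by itself change how quickly the index fragments.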
 |
|
cgunner
Yak Posting Veteran
95 Posts |
Posted - 2005-08-20 : 09:16:22
|
Thanks Kristen and mmarovic for your help. Your tips are greatly appreciated. |
 |
|
|