tross
Starting Member
20 Posts
Posted - 2009-03-12 : 08:45:53

Not sure what to look at, but as of Tuesday the rebuild of the clustered index went from 20 minutes to 1 hour 30 minutes. Nothing appears to be different: no new indexes, no new jobs. What can I use to find out what is causing the longer rebuild time? I don't see anything blocking or running at the same time. I shrank the log file, and I'm going to try to reboot the server this weekend.

Server info:
SQL Server 2000 8.00.760
Windows Server 2000
Clustered index info: 5 columns totaling 32 characters, FILLFACTOR = 90, STATISTICS_NORECOMPUTE

The 7 job steps are (the whole process took 3:00 hours to run, now 6:20):
Step 1: drop the two indexes (drop secondary, then clustered)
Step 2: alter the table to drop a column
Step 3: truncate and reload the rows from the production DB
Step 4: alter the table to add the column back (claim)
Step 5: recreate the data in the (claim) column
Step 6: recreate the data in the "date received" column
Step 7: recreate the two indexes (recreate clustered, then secondary)

The secondary index only takes a few minutes; the clustered index is the only one that runs for an hour or more. There are currently 2,400,000 rows to index. Not sure what the count was on Monday (there are about 100 - 300 rows added per day). Like I mentioned, I don't see where anything changed.
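For reference, the seven steps above might look roughly like this in T-SQL. This is a sketch only; all table, column, and index names here are placeholders, not the actual job code from the thread:

[code]
-- Step 1: drop the two indexes (secondary first, then clustered)
DROP INDEX DBclaims.IX_DBclaims_Secondary
DROP INDEX DBclaims.IX_DBclaims_Clustered

-- Step 2: drop the concatenated column
ALTER TABLE DBclaims DROP COLUMN Claim

-- Step 3: truncate and reload from the production DB
TRUNCATE TABLE DBclaims
INSERT INTO DBclaims SELECT * FROM ProductionDB..ClaimsSource

-- Step 4: add the column back
ALTER TABLE DBclaims ADD Claim char(32) NULL

-- Step 5: recreate the concatenated column data
UPDATE DBclaims SET Claim = CLYEAR + CLMONTH + CLDAY + CLACLMNO
-- Step 6: recreate the "date received" column data (derivation not shown in the thread)

-- Step 7: recreate the indexes (clustered first, then secondary)
CREATE CLUSTERED INDEX IX_DBclaims_Clustered
    ON DBclaims (CLYEAR, CLMONTH, CLDAY, CLACLMNO, CLLINE)
    WITH FILLFACTOR = 90, STATISTICS_NORECOMPUTE
CREATE INDEX IX_DBclaims_Secondary ON DBclaims (CLGROUP)
[/code]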
SwePeso
Patron Saint of Lost Yaks
30421 Posts
Posted - 2009-03-12 : 08:48:35

Shrinking the log file is probably the culprit. Almost all the actions in your list need to be logged, so SQL Server must grow the log file whenever it needs more space. Now your log file is fragmented and slow to use.

E 12°55'05.63" N 56°04'39.26"
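One way to check for this kind of internal log fragmentation is DBCC LOGINFO, which is undocumented but available on SQL Server 2000; it returns one row per virtual log file (VLF), and a very high row count suggests the log has grown in many small increments. The database name below is a placeholder:

[code]
-- One row per VLF; hundreds or thousands of rows indicate
-- a badly fragmented transaction log.
USE datasource
DBCC LOGINFO
[/code]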
tross
Starting Member
20 Posts
Posted - 2009-03-12 : 09:36:07

If you're saying that shrinking the log increased the time: I shrank the log after the time jumped, and it still runs long.

Monday: jobs were finished by 5:30 am EST (normal time)
Tuesday: jobs were finished by 7:30 am EST - a 2 hour jump
Wednesday: jobs were finished by 8:30 am EST - manually shrank database logs
Thursday: jobs were finished by 8:30 am EST

Nightly the server runs a maintenance plan; if the log file gets large, I manually shrink it. Here are the commands used to shrink the log:

BACKUP LOG datasource WITH TRUNCATE_ONLY
DBCC SHRINKFILE (datasource_log, 600)

(If you're concerned about losing some logged transactions: it's a data warehouse (history only), and transaction activity is 99% SELECTs with no adds or deletes, so the recovery plan is to rerun the nightly jobs to rebuild it.)

The database sizes:
data1: 22054 MB
data2: 3800 MB
log1: 2096 MB (was around 6000 to 8000)
log2: 1919 MB

The following have not changed:
data set to autogrow
logs set to autogrow
database set to autoshrink
auto create stat
auto update stat

Tim
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts
Posted - 2009-03-12 : 10:26:18

Some of the items in your process make no sense.

1. There is no reason to drop the clustered index and then add it again, because it forces a rebuild of the table each time, going from clustered to heap and then from heap back to clustered.
2. Why alter the table to drop a column and then add it again later?

Why not just do this:
1. Truncate the table and reload the rows from the production DB.
2. Recreate the data in the claim and "date received" columns (whatever that involves).
3. Run DBCC DBREINDEX on the table.

CODO ERGO SUM
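The simplified three-step process above might be sketched as follows; the table, column, and source names are placeholders, and the claim derivation is taken from later posts in the thread:

[code]
-- 1. Truncate and reload (load ordered by the clustered key if possible)
TRUNCATE TABLE DBclaims
INSERT INTO DBclaims SELECT * FROM ProductionDB..ClaimsSource

-- 2. Recreate the derived column data
UPDATE DBclaims SET Claim = CLYEAR + CLMONTH + CLDAY + CLACLMNO

-- 3. Rebuild all indexes on the table ('' = all indexes, 90 = fillfactor)
DBCC DBREINDEX ('DBclaims', '', 90)
[/code]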
tosscrosby
Aged Yak Warrior
676 Posts
Posted - 2009-03-12 : 11:39:56

quote:
Originally posted by tross

database set to autoshrink

Tim

And turn that off. It will fragment your indexes when it does shrink, will create a performance hit, and is not considered a best practice in the MS SQL world.

Terry

-- Procrastinate now!
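Turning autoshrink off is a one-liner on SQL Server 2000; the database name is a placeholder:

[code]
ALTER DATABASE datasource SET AUTO_SHRINK OFF
[/code]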
tross
Starting Member
20 Posts
Posted - 2009-03-12 : 14:57:32

1) This was the way it was last year:

1. Truncate the table and reload the rows from the production DB.
2. Recreate the data in the claim and "date received" columns.
3. Run DBCC DBREINDEX on the table.

This took 3+ hours to run. When I changed it to what I have now, I was able to take 30 - 40 minutes off the build time (2:30 avg). But if dropping the clustered index is a problem, it does not explain the jump in time now; I think it should have shown up when I made the change, unless it's tied to the fragmenting issue mentioned.

A second index was added (before the above change), and I tried to drop only the second index during the rebuild, but it was taking 20 to 30 minutes to drop the index: the clustered index was being recreated, or so sp_who would indicate, and this was confirmed by another DBA in another forum ("creating index" would be displayed). (Dropping the index was before the table was truncated.)

2) As to dropping the claims column: the table in the other database (a non-SQL healthcare DB) has over 330 columns and does not have the column. It is in 5 pieces in the source table, which the original DBA concatenated into the claim column on the SQL server, probably to avoid two issues:

A) Not having to code 330+ names in the INSERT INTO, where SELECT * FROM is quicker. It also makes the SP smaller; the healthcare DB has all the groups in their own tables, and there are 100 groups.

insert into DBclaims select * from ss001
insert into DBclaims select * from ss002

is better than

insert into DBclaims select year, month, day, ..... from ss001
insert into DBclaims select year, month, day, ..... from ss002

B) Not having to code "where claimNo = year+month+day+clmno+lineno" when he wanted to search the SQL database.

3) As to autoshrink, I agree it's not best, but it beats having the database halt due to no disk space (the log file was larger than the data file). This happened 3 times in the past, even with a weekly full backup, a weekly maintenance plan, and a weekly DBCC SHRINKFILE.

Tim
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts
Posted - 2009-03-12 : 17:12:53

1. Drop indexes after you truncate the table, but do not remove the clustered index. Make sure that you load the table in order by the clustered index to prevent index fragmentation. After loading the table, run DBCC DBREINDEX on the table, and then create the non-clustered indexes.

2. So you are dropping and adding the column because whoever wrote the procedure that loads it was too inexperienced or lazy to write the insert statement with a column list? You could just have one statement in this form to do it:

[code]
insert into TargetTable
    ( Col1, Col2, ... and so on ... ClaimNo, Col331, Col332 )
select
    Col1, Col2, ... and so on ...
    ClaimNo = year+month+day+clmno+lineno,
    Col331, Col332
from
    (
    select * from table1 union all
    select * from table2 union all
    select * from table3 union all
    ... and so on ...
    select * from table98 union all
    select * from table99 union all
    select * from table100
    ) as CombinedAccounts
order by
    PKColumn1, PKColumn2, PKColumn3
[/code]

It would be even better to use the select listed above as the source for a DTS package datapump, and set a batch size of around 50,000 to prevent generating large transactions.

3. It sounds like your problems with transaction log size are due to not doing enough transaction log backups to keep the transaction log to a reasonable size.

Edit: Added table alias CombinedAccounts to code above.

CODO ERGO SUM
tross
Starting Member
20 Posts
Posted - 2009-03-13 : 07:53:09

Thanks for the suggestions, I'll look into using them. The problem has corrected itself: I rebooted the server, and the process is back to 2 hours. The log went from 6000+ MB to 20 MB after the reboot, so either the log was a mess or something else was hindering the process. Thanks for all the help.
tross
Starting Member
20 Posts
Posted - 2009-03-13 : 08:55:18

I tried to code the following; I'm getting "Incorrect syntax near ')'."

[code]
select CLYEAR, CLMONTH, CLDAY, CLACLMNO, CLLINE, CLGROUP, ... so on
CLSRVPLS,
Claim = CLYEAR+CLMONTH+CLDAY+CLACLMNO
from
    (
    select * from S210608 where clyear not like '18%' union ALL
    select * from S210306 where clyear not like '18%' union ALL
    select * from S100395 where clyear not like '18%'
    )
/*order by CLGROUP, CLYEAR, CLMONTH, CLDAY, CLACLMNO*/
[/code]

I tested the SELECT *'s and they work by themselves, so that code appears OK.

TIA
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts
Posted - 2009-03-13 : 09:32:04

[code]
select CLYEAR, CLMONTH, CLDAY, CLACLMNO, CLLINE, CLGROUP, ... so on
CLSRVPLS,
Claim = CLYEAR+CLMONTH+CLDAY+CLACLMNO
from
    (
    select * from S210608 where clyear not like '18%' union ALL
    select * from S210306 where clyear not like '18%' union ALL
    select * from S100395 where clyear not like '18%'
    ) DerivedTableName
/*order by CLGROUP, CLYEAR, CLMONTH, CLDAY, CLACLMNO*/
[/code]
sakets_2000
Master Smack Fu Yak Hacker
1472 Posts
Posted - 2009-03-13 : 09:34:22

Also, I'd suggest that you list the column names instead of using select * in your union. An added column in S210608 will otherwise raise an error.
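One way to avoid hand-typing 330+ column names is to let the server generate the list from the syscolumns catalog table (available on SQL Server 2000). This is a sketch with a placeholder table name; note that the variable-concatenation trick it relies on is a widely used pattern rather than documented behavior:

[code]
-- Build a comma-separated column list for one table, then PRINT it
-- so it can be pasted into the INSERT / SELECT statements.
DECLARE @cols varchar(8000)
SELECT @cols = COALESCE(@cols + ', ', '') + name
FROM syscolumns
WHERE id = OBJECT_ID('S210608')
ORDER BY colid
PRINT @cols
[/code]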
tross
Starting Member
20 Posts
Posted - 2009-03-13 : 11:24:46

It fixed the issue. As to replacing the * with the 330+ column names: that would make the code quite long, and I was testing with only three of the 100+ tables I need to add to the code. Thanks again.
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts
Posted - 2009-03-13 : 12:21:21

If you load the data with a single statement ordered by the columns of the clustered index, you might find that it is not necessary to re-index the table after the load, and that you only need to add the non-clustered indexes.

You can use DBCC SHOWCONTIG to look at fragmentation after you load the table. If fragmentation is low, just skip the DBCC DBREINDEX.

CODO ERGO SUM
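A minimal fragmentation check along those lines, with a placeholder table name:

[code]
-- After the load, check the table's fragmentation. In the output, a
-- high "Scan Density" and low "Logical Scan Fragmentation" mean the
-- DBCC DBREINDEX step can be skipped.
DBCC SHOWCONTIG ('DBclaims') WITH FAST
[/code]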
tross
Starting Member
20 Posts
Posted - 2009-03-13 : 15:11:58

Thanks, I'll look into using this.