
 Adding a column to VLDB 200GB table


sureshot
Yak Posting Veteran

72 Posts

Posted - 2007-05-01 : 00:02:42
The column I'm adding needs to be part of the clustered PK (it will be the last of three columns) so I need to recreate all the indexes.

My DB is set to the FULL recovery model with ALLOW_SNAPSHOT_ISOLATION ON. I've tried two methods so far.

Method 1:

BEGIN TRANSACTION
CREATE TABLE dbo.Tmp_copyoftablewithnewfield
(
<original fields plus newfield>
) ON [PRIMARY]
IF EXISTS(SELECT * FROM dbo.originaltable)
EXEC('INSERT INTO dbo.Tmp_copyoftablewithnewfield (<original fields>)
SELECT <original fields> FROM dbo.originaltable WITH (HOLDLOCK TABLOCKX)')
GO
DROP TABLE dbo.originaltable
GO
EXECUTE sp_rename N'dbo.Tmp_copyoftablewithnewfield', N'originaltable',
'OBJECT'
GO
<recreate PK constraint>
<rebuild indexes>
COMMIT

Pros: Lets me add the new field in the position I'd like (not a big deal)
Cons: Tons of wasted space and time. It took about 15 hours.

Method 2:
SET XACT_ABORT ON
GO
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
<drop PK constraint>
<drop indexes>

ALTER TABLE [dbo].[originaltable] ADD
[newfield] [tinyint] NOT NULL CONSTRAINT [DF_originaltable_newfield] DEFAULT
((1))

<recreate PK constraint>
<rebuild indexes>
COMMIT TRANSACTION

Pros: No copy of the entire table taking up another 200GB of space in the db data file
Cons: My tempdb grew to accommodate the row versioning info for every row in the 200GB table. It took over 30 hours.
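
(In both methods the <recreate PK constraint> and <rebuild indexes> placeholders are basically the following; the key column, constraint, and index names are just stand-ins for the real ones:)

-- Recreate the clustered PK with the new column as the last of the three key columns
ALTER TABLE dbo.originaltable
ADD CONSTRAINT PK_originaltable
PRIMARY KEY CLUSTERED (keycol1, keycol2, newfield)
ON [PRIMARY]
GO
-- Recreate each nonclustered index
CREATE NONCLUSTERED INDEX IX_originaltable_keycol2 ON dbo.originaltable (keycol2)
GO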

A lot of time and disk space is wasted with both.

Since the db is going to be unavailable to users, I have some flexibility here. I was considering turning ALLOW_SNAPSHOT_ISOLATION OFF, trying method 2 again (which should stop the versioning in tempdb), and then turning it back on afterward.
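
Something like this is what I had in mind (the database name is just a placeholder):

ALTER DATABASE MyDB SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
-- <run the Method 2 conversion here>
ALTER DATABASE MyDB SET ALLOW_SNAPSHOT_ISOLATION ON
GO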

I was also curious whether setting the database recovery model to SIMPLE would cut down on log usage; I could set it back to FULL when done.
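
If I do, the switch would be something along these lines (database name and backup path are placeholders; my understanding is I'd need a full backup afterward before log backups can resume):

ALTER DATABASE MyDB SET RECOVERY SIMPLE
GO
-- <run the conversion here>
ALTER DATABASE MyDB SET RECOVERY FULL
GO
-- Full backup to restart the log backup chain
BACKUP DATABASE MyDB TO DISK = 'D:\Backups\MyDB_full.bak'
GO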

Do these really need to be in a transaction? If there's a hardware failure or something unexpected, I can just restore from backup and do the conversion again. If the presence of the transaction itself is causing more disk usage for logging or any other slowdown, I'd rather do without it.

Given the amount of time this conversion takes, I wanted to get some
feedback other than "just try it" before doing any new tests.

Thanks.

byrmol
Shed Building SQL Farmer

1591 Posts

Posted - 2007-05-01 : 00:27:24
Have you considered BCP?

DavidM

Production is just another testing cycle

sureshot
Yak Posting Veteran

72 Posts

Posted - 2007-05-01 : 01:37:31
The 200GB of data is already in there. I'm adding a new column that will also be part of the PK.

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2007-05-01 : 01:42:26
I had to do something similar (alter a column from INT to BIGINT for a table with over 1 billion rows, table size 250 GB). I created a new table with the structure I wanted, BCPed the data out into smaller chunked files, BCPed it into the new table, validated that all the data was there, dropped the old table, renamed the new table back to the original name, reindexed the table, and added the relationships back. I finished in about 5-6 hrs. Of course it depends on your hardware. We had an HP Superdome and I did it during a scheduled production downtime.
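
Roughly like this (table, server, and file names here are just illustrative, not the actual commands I ran):

-- New table with the structure you want (new column included in the clustered PK)
CREATE TABLE dbo.newtable
(
    keycol1 int NOT NULL,
    keycol2 int NOT NULL,
    newfield tinyint NOT NULL CONSTRAINT DF_newtable_newfield DEFAULT (1),
    -- <remaining original columns>
    CONSTRAINT PK_newtable PRIMARY KEY CLUSTERED (keycol1, keycol2, newfield)
)
GO

-- From the command line, export in ranges and reload each chunk, e.g.:
-- bcp "SELECT * FROM MyDB.dbo.originaltable WHERE keycol1 BETWEEN 1 AND 100000000" queryout chunk1.dat -n -T -S MyServer
-- bcp MyDB.dbo.newtable in chunk1.dat -n -T -S MyServer -b 100000 -h "TABLOCK"

-- Validate row counts, then swap the tables and rebuild the rest
DROP TABLE dbo.originaltable
GO
EXEC sp_rename N'dbo.newtable', N'originaltable', 'OBJECT'
GO
-- <recreate nonclustered indexes and foreign keys>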

Dinakar Nethi
SQL Server MVP
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/
   
