
Split out large table from database to new database


rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-27 : 12:31:45
I have a large database (over 250 GB) and about 1/2 of it is one table.

I have SQL Server 2000.

The table has about 4 years of data in it, and I would only need to keep about 1 1/2 to 2 years of data.

I have tried to remove old data in the past but it takes about a day to remove 10 days worth.

Is there a good way to copy that table to the new database and at the same time only keep the last 2 years or so?

Also, the current table has an autonumber (identity) primary key, and is heavily indexed.

I am thinking of keeping the primary key as autonumber and clustering on date. Leave the other indexes off during the load, and add them later?

To make a real unique primary key would require many columns.

I am really interested in the fastest way to do this.

We are having issues with this database, and I think cutting its size by more than half will help us deal with them.

Several reports will have to have their ODBC connections changed, but I feel the effort is worth it.

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-10-27 : 12:57:17
Do it by day or hour or whatever - put an index on date (not clustered)
I haven't tried any of this so test it first.

declare @d datetime, @step datetime, @enddate datetime
set @step = '01:00:00'
set @enddate = '20080101'
select pk into #a from tbl where 1=0
select @d = min(date) from tbl
while @d + @step < @enddate
begin
delete #a
select @d = @d + @step
insert #a select pk from tbl where date < @d
insert mynewdb..mynewtbl select * from tbl where pk in (select pk from #a)
delete tbl where pk in (select pk from #a)
end


==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 06:50:31
I was hoping for a way to do it in one session.

I do know SQL and I do know VBA.

If copying the whole table to a brand new database would be better, that is what I will do.

I would rather delete from the table when it is the only thing in the new database than delete from the table when it is part of a database with many other tables.

If copying the whole table is better, what would be the best way to do that (table probably is 125 GB+) ?

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 08:51:21
Would DTS be a good way to do this?

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 12:20:07
How about a VBA date loop (in an Access pass-through query, a day at a time)?
select * into new table from old table where DelDate = 'date'

Only thing is both the old and new tables would have autonumber (identity) fields as the primary key.

Maybe same thing but in stored procedure (not sure how to code that)

Both databases would be on the same server.
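A day-at-a-time loop like that could be sketched as a stored procedure (untested; the database, table and column names — olddb, newdb, bigtable, DelDate — are placeholders):

```sql
-- Untested sketch: copy one day at a time between two databases on
-- the same server. All object names here are placeholders.
create procedure CopyByDay
    @startdate datetime,
    @enddate   datetime
as
begin
    declare @d datetime
    set @d = @startdate
    while @d < @enddate
    begin
        -- if identity values must be preserved, SET IDENTITY_INSERT ON
        -- and list the columns explicitly instead of select *
        insert newdb..bigtable
        select * from olddb..bigtable
        where DelDate >= @d and DelDate < @d + 1
        set @d = @d + 1
        -- back up or truncate the log here if it grows too fast
    end
end
```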







nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-10-28 : 12:24:35
What do you mean by one session?
You could put a transaction around what I did if you are worried about rollbacks - and it should have error handling (SQL 2000 has no TRY...CATCH, so check @@ERROR) to trap errors.


rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 12:33:17
What I meant was not do a day, wait till done, do another. I would like to get through copying the year or two of data in a reasonable amount of time, not spend a month doing it.

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 12:37:39
Have about 500,000 long records per day

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-28 : 12:40:56
My thinking was a day at a time (simple recovery mode) would keep the transaction log from growing to an unmanageable size.

X002548
Not Just a Number

15586 Posts

Posted - 2011-10-28 : 13:09:37
You are going to have to worry about the log and do a commit and backup so your log doesn't grow to many GBs.

How about this

Create a new table (or select * into..)

Insert the rows you want to keep there (How many rows is 2 years of data?) Maybe do

SELECT * INTO newTable FROM OldTable WHERE Dates = ???? (minimally logged)

bcp out the data from the old table

robocopy it off the server

drop the old table

sp_rename the new table to the old name
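In T-SQL the keep-and-rename steps above might look roughly like this (untested sketch; the table names and the cutoff date are placeholders):

```sql
-- Minimally logged on SQL 2000 when the database is in simple or
-- bulk-logged recovery (or 'select into/bulkcopy' is enabled)
select * into NewTable
from OldTable
where DelDate >= '20100101'   -- placeholder cutoff: rows to keep

-- ... bcp out / robocopy / verify the archive first ...

drop table OldTable
exec sp_rename 'NewTable', 'OldTable'

-- then recreate the clustered index, other indexes and constraints
```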



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx


Want to help yourself?

http://msdn.microsoft.com/en-us/library/ms130214.aspx

http://weblogs.sqlteam.com/brettk/

http://brettkaiser.blogspot.com/



Kristen
Test

22859 Posts

Posted - 2011-10-28 : 14:32:35
The "old hand's" way of doing this is either:

1) If the percentage you want to keep is, say, less than 10% then copy the "good" data to a new temporary table, drop the old table, rename the temporary-table to the original table name.

You need to lock everyone out whilst this is done. Easiest way is to use scheduled downtime if you have it. This method is also complicated by needing to drop, and recreate, all Constraints, Indexes, Primary keys, Foreign keys, etc. associated with the table - so if you are not familiar with the process (and you have said you aren't) this method could carry some risk.

2) Delete the stale-data. This can be done, relatively easily, in a way that will "gently" delete the old data (over hours, days or even weeks if necessary) so that there is no/little impact on people using the system, nor a sudden massive impact on the database logging system.
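On SQL Server 2000 the "gentle" batched delete in option 2 is typically done with SET ROWCOUNT (untested sketch; the table name, column and cutoff are placeholders):

```sql
-- Delete stale rows in small batches so each transaction stays short
-- and the log can be backed up between batches.
set rowcount 10000
while 1 = 1
begin
    delete OldTable where DelDate < '20100101'   -- placeholder cutoff
    if @@rowcount = 0 break
    waitfor delay '00:00:05'   -- optional pause to reduce impact
end
set rowcount 0
```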

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 08:12:36
Since I want to move less than half of a large table...

I am thinking of copying it to the new table with just an autonumber(identity) field as primary key.

I plan to cluster on date. Would it be better to do that as I copy day by day, or after I finish?

I also plan to add many indexes when I am done.

I plan to change the ODBC links in the Access programs that use the table after the full table is copied.

(There are no foreign keys defined in SQL Server, nor any user defined functions or stored procedures)

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-10-31 : 08:16:33
If you are concerned about performance I would consider partitioning then you can swap partitions in and out and copy without having to worry about deletes.
Can use bulk copies for moving the data.

Not really sure what the issue is though as it seems to be changing.

If it's just the speed of the copy then bcp the data that you need out and in to the new database.
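One way to drive bcp from T-SQL, if xp_cmdshell is available (untested; the server name, paths and database names are placeholders):

```sql
-- -n = native format, -T = trusted connection, -S = server name;
-- all names and paths here are placeholders.
exec master..xp_cmdshell
    'bcp olddb..bigtable out d:\bigtable.bcp -n -T -SMYSERVER'
exec master..xp_cmdshell
    'bcp newdb..bigtable in d:\bigtable.bcp -n -T -SMYSERVER'
```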


Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-10-31 : 08:20:06
You should look into partitioning. Even though you are on 2000 you can still do partitioning, although it's more manual than in 2005/2008. Check this article on my blog:

-> http://thefirstsql.com/2010/06/18/partitioning-the-old-fashion-way/

- Lumbago
My blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 08:43:02
I took a quick look at the partitioning. Not too clear to me. Also, I am only going to have the operating system (C:)/data disk (D:) and transaction log disk (F:). If we get a real server instead of a workstation, those would just be mirrored (RAID).

Not sure if worth the extra confusion.

How about the question on when to index, especially when to add the clustered index?

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 08:44:22
Also, not sure if I mentioned this, but we are running Standard Edition.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-10-31 : 08:50:25
quote:
Originally posted by rocksteer

I took a quick look at the partitioning. Not too clear to me.
Then you really should take a closer look. What you basically do is physically split your data table into several tables, each with say one month's worth of data, and then work with them through a view. Handling tables with 15 million rows is FAR easier than tables with hundreds of millions...
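A minimal sketch of a SQL 2000-style partitioned view, assuming a monthly split on a datetime column (all names are placeholders; for the view to be updatable, the partitioning column must be part of the primary key):

```sql
-- One physical table per month, each constrained to its date range
create table Data_201110 (
    id      int      not null,
    DelDate datetime not null,
    constraint CK_Data_201110
        check (DelDate >= '20111001' and DelDate < '20111101'),
    constraint PK_Data_201110 primary key (id, DelDate)
)
go
-- The view carries the old table's name so existing reports keep working
create view BigTable as
select * from Data_201109
union all
select * from Data_201110
```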


rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 09:04:28
I did not plan on creating any views, and if partitioned, would the reports using it have to have any knowledge of the partitions? I don't want to have to rewrite the reports currently using the table.

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-10-31 : 12:30:11
Just give the view the same name as the old table (you'd of course have to rename the old table) and it should work fine. You should do some thorough testing first though...but if you get partitioning to work your life will be *a lot* easier.


rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 12:47:00
Still no answer on when to cluster: after the load or during the load?

rocksteer
Yak Posting Veteran

69 Posts

Posted - 2011-10-31 : 12:49:41
Also, about partitioning: is it transparent to the user?
(Does the user need to pick the month's view, or is the specific month's partition picked just by entering a date?)

If I wanted to delete a rolling month of data from 1 1/2 years back, would this be a good solution?