Author |
Topic |
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-27 : 12:31:45
|
I have a large database (over 250 GB) and about 1/2 of it is one table.I have SQL Server 2000.The table has about 4 years of data in it, and I would only need to keep about 1 1/2 to 2 years of data.I have tried to remove old data in the past but it takes about a day to remove 10 days worth.Is there a good way to copy that table to the new database and at the same time only keep the last 2 years or so?Also, the current database has a primary key of autonumber, and is heavily indexed.I am thinking to keep the primary key as autonumber, and cluster on date. Leave the other indexes off during the load, and add them later?To make a real unique primary key would require many columns.I am really inerested in the fastest way to do this.We are having issues with this database, and I think cutting its size in more that half will help us deal with them.Seveal reports with have to have their ODBC changed, but I feel that the effort is worth it. |
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-10-27 : 12:57:17
|
Do it by day or hour or whatever - put an index on date (not clustered)I haven't tried any of this so test it first.declare @d datetime, @step datetime = '01:00:00'declare @enddate datetmie = '20080101'select pk into #a from tbl where 1=0select @d = min(date) from tblwhile @d+@step < @enddatebegindelete #aselect @d = @d + @stepselect pk into #a where date < @dinsert mynewdb..mynewtbl select * from tbl where pk in (select pk from #a)delete tbl where pk in (select pk from #a)end==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 06:50:31
|
I was hoping for a way to do it in one session.I do know SQL and I do know VBA.If copying the whole table to a brand new database would be better, that is what I will do. I would rather delete from the table when it is the only thing in the new database than delete from the table when it is part of a database with many other tables.If copying the whole table is better, what would be the best way to do that (table probably is 125 GB+) ? |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 08:51:21
|
Would DTS be a good way to do this? |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 12:20:07
|
How about a VBA date loop (in Access passthrough query, day at a time) select into new table ... select * from old table ... Where DelDate = 'date' ?only thing is both and new server would have autonumber (identity fields) as primary keyMaybe same thing but in stored procedure (not sure how to code that)Both databases would be on the same server. |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-10-28 : 12:24:35
|
What do you mean by one session?You could put a transaction around what I did if you are worried about rollbacks - and it should have a try catch block to trap errors.==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 12:33:17
|
What I meant was not do a day, wait till done, do another. I would like to try and get through copying the year or two of data in a reasonable amount of time. Not spend a month doing it. |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 12:37:39
|
Have about 500,000 long records per day |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-28 : 12:40:56
|
My thinking was a day at a time (simple recovery mode) would keep the tranaction log from growing to unmanagable size. |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2011-10-28 : 13:09:37
|
You are going to have to worry about the log and do a commit and backup so your log doesn't grow to many many GB'sHow about thisCreate a new table (or select * into..)Insert the rows you want to keep there (How many rows is 2 years of data?) Maybe doSELECT * INTO newTable FROM OldTable Where Dates = ???? (min logged)bcp out the data from the old tablerobocopy it off the serverdrop the old tablesp_rename the new table to the old nameBrett8-)Hint: Want your questions answered fast? Follow the direction in this linkhttp://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspxWant to help yourself?http://msdn.microsoft.com/en-us/library/ms130214.aspxhttp://weblogs.sqlteam.com/brettk/http://brettkaiser.blogspot.com/ |
|
|
Kristen
Test
22859 Posts |
Posted - 2011-10-28 : 14:32:35
|
The "old hand's" way of doing this is either:1) If the percentage you want to keep is, say, less than 10% then copy the "good" data to a new temporary table, drop the old table, rename the temporary-table to the original table name.You need to lock everyone out whilst this is done. Easiest way is to use scheduled downtime if you have it. This method is also complicated by needing to drop, and recreate, all Constraints, Indexes, Primary keys, Foreign keys, etc. associated with the table - so if you are not familiar with the process (and you have said you aren't) this method could carry some risk.2) Delete the stale-data. This can be done, relatively easily, in a way that will "gently" delete the old data (over hours, days or even weeks if necessary) so that there is no/little impact on people using the system, nor a sudden massive impact on the database logging system. |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 08:12:36
|
Since I want to move less than half of a large table...I am thinking of copying it to the new table with just an autonumber(identity) field as primary key.I plan to cluster on date. Would it be better to do that as I copy day by day, or after I finish?I also plan to add many indexes when I am done.I plan to change the ODBC links in the Access programs that use the table after the full table is copied.(There are no foreign keys defined in SQL Server, nor any user defined functions or stored procedures) |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-10-31 : 08:16:33
|
If you are concerned about performance I would consider partitioning then you can swap partitions in and out and copy without having to worry about deletes.Can use bulk copies for moving the data.Not really sure what the issue is though as it seems to be changing.If it's just the speed of the copy then bcp the data that you need out and in to the new database.==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-10-31 : 08:20:06
|
You should look in to partitioning. Even though you are on 2000 you can still do partitioning although it's more manual than in 2005/2008. Check this article on my blog:-> http://thefirstsql.com/2010/06/18/partitioning-the-old-fashion-way/- LumbagoMy blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/ |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 08:43:02
|
I took a quick look at the partitioning. Not too clear to me. Also, I am only going to have the operating system(c)/data disk(d) and tranaction log disk(f). If we get a real server instead of workstation, those would just be mirrored (RAID).Not sure if worth the extra confusion.How about the question on when to index, especially when add clustering index? |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 08:44:22
|
Also, not sure if I mentioned this, but we are running Standard Edition. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-10-31 : 08:50:25
|
quote: Originally posted by rocksteer I took a quick look at the partitioning. Not too clear to me.
Then you really should take a closer look. What you basically do is to physically split your data table in to several tables, each with say one months worth of data, and then work with them through a view. Handling tables with 15 mill rows is FAR easier than tables with hundreds of millions...- LumbagoMy blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/ |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 09:04:28
|
I did not plan on creating any views, and if partitioned would the using reports have to have any knowledge of the partitions? I don't want to have to rewrite the reports currently using the table. |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-10-31 : 12:30:11
|
Just give the view the same name as the old table (you'd of course have to rename the old table) and it should work fine. You should do some thorough testing first though...but if you gt partitioning to work your life will be *a lot* easier.- LumbagoMy blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/ |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 12:47:00
|
Still no answer to when to cluster, after load or during load? |
|
|
rocksteer
Yak Posting Veteran
69 Posts |
Posted - 2011-10-31 : 12:49:41
|
Also, about partition. Is it transparent to user? (need user pick view month, or just by entering date, specific month partition is picked?)If wanted to delete rolling month of data 1 1/2 back, good solution? |
|
|
Next Page
|