Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Administration (2000)
 Decreasing database size puzzle

Author  Topic 

sql_er
Constraint Violating Yak Guru

267 Posts

Posted - 2009-02-04 : 15:04:17
Hello,

We are using SQL Server 2000. We have a database which has been growing for for the last 3 years, and is currently at 64GB. Upon investigation, I found out that the database has many tables, and one of them - table "A" - is the largest, and is taking up ~ 40GB (20GB for data and 20GB for indexes). This table has 15 columns and we keep data in it forever (i.e. do not purge it).

After talking to developers, I found out that we only need 2 columns forever, and others only for 1 year. Thus came a question of how to use this knowledge to decrease the table (database) size. The goal was to come up with an approach which is easy to maintain, and would decrease the space usage as much as possible, and improve overall performance. We brainstormed 3 approaches and they were:

Approach 1
----------
1. Create "A_Archive" table with 2 columns only
2. Move the data from table "A" to "A_Archive" which is older than 1 year
3. Purge data from the "A" table, which is older than 1 year
4. Create a job, which would periodically (daily?) move the data from table "A" to table "A_Archive" and purge day's worth of old data from table "A"

Approach 2
----------
1. Create "A_Archive" table with 2 columns only
2. Move ALL the data from "A" to "A_Archive"
3. Purge data from the "A" table, which is older than 1 year
4. Modify the stored procedure which inserts into "A" table to simultaneously insert into the "A_Archive" table

Approach 3
----------
1. Change the columns in the "A" table to be NULLable (those which are not)
2. NULL the data which is > 1 year old, for all except the 2 columns we need to keep forever

Here are my thoughts:

Approach 1 decreases more space than Approach 2
Approach 2 is easier to maintain than Approach 1
Approach 3 I don't like, as it would require changing the columns to be NULLable

My questions are:

1. Which of these approaches would you advise?
2. Are there any other possibilities that any of you used to achieve the same goal?
3. When the data is NULLed (as in Approach 3), is the space really released and database size decreases? If so, does it depend on where in the table the column is - that is (last column or middle column) or not?

Thanks a lot in advance!

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-02-04 : 16:02:22
I would go with number Approach 1.

Stay away from Approach 3. Doing that is likely to result in horrible query plans with terrible performance. When SQL Server sees in the statistics that the vast majority of columns are nullable, you will likely get table scans on every query.






CODO ERGO SUM
Go to Top of Page

sql_er
Constraint Violating Yak Guru

267 Posts

Posted - 2009-02-04 : 16:25:01
Michael, thank you for advice in regards to Approach 3.

Could you please elaborate as to why you would not go with Approach 2?

Just an additional information: A_Archive table's 2 columns are INT and VARCHAR(255). And, on a daily basis, we insert ~ 150,000 records into table "A".
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-02-04 : 16:50:18
quote:
Originally posted by sql_er

Michael, thank you for advice in regards to Approach 3.

Could you please elaborate as to why you would not go with Approach 2?

Just an additional information: A_Archive table's 2 columns are INT and VARCHAR(255). And, on a daily basis, we insert ~ 150,000 records into table "A".



Approach 2 stores the same data twice, so that seems like a waste. Also, it is doing more work as you insert indidual rows into the table.





CODO ERGO SUM
Go to Top of Page
   

- Advertisement -