Partitioning the Data in a TableBy Sean Baird on 20 September 2000 | Tags: Database Design Michael writes "1) Is it ever good database design practice (for speed sake, etc.) to essentially make copies of tables to hold a certain group of data?
Michael -
What you're referring to is an optimization method called horizontal partitioning. That is, a table is split up into multiple smaller tables containing the same number of columns, but fewer rows. Compare this to vertical partitioning, in which the table is split into multiple smaller tables with the same number of rows, but fewer columns. And yes, this design decision is often made to improve performance. Horizontally partitioning a table gives us some advantages:
Why would you want to do this? Well, maybe you have a big table - hundreds of millions of rows, for instance. Or maybe not so many rows, but large rows. Or maybe you have a table in a data warehouse that contains frequently and infrequently accessed rows. All of these situations are candidates for partitioning. Now, there are two big downsides to all of this:
CREATE TABLE Subdiv_ClaireRidgeEstates (SubdivID int, LotID int /*, etc.*/) You may notice that I included the Subdivision ID in each table. This is important; for the partitioned view to work most effectively, the QP must be able to know that each partition table will only contain a certain type of data. To do this, you need to build CHECK constraints on each table on the ID that you're partitioning on. Since you're partitioning the data by subdivision, you will build CHECK constraints on SubdivID:ALTER TABLE Subdiv_ClaireRidgeEstates ADD CONSTRAINT CK_CRE_SubdivID CHECK (SubdivID = 42) You could just as easily partition by using a surrogate key field and assigning a range of key values to each partition table. Or by partitioning on a date and using a range of dates for each partition value. Regardless, you still need those CHECK constraints in place on each table.After actually creating the partition tables, distributing the data, and building the CHECK constraints, building the view is pretty easy. You just SELECT * from each partition table and use UNION ALL to combine the results of the query: Now, if you've been following along with the example, try inserting some sample rows into each table:INSERT Subdiv_ClaireRidgeEstates VALUES (42,9999) Now, turn on the "Show Execution Plan" option in query analyzer, and run the following queries: SELECT * FROM Subdivision WHERE SubdivID = 42 You'll notice that for the first two queries, SQL Server only pulls information from the required partition table. Only in the last query, where we don't filter by SubdivID, does the QP pull data from each partition table. In SQL Server 7.0, you unfortunately cannot update data in a partitioned view. However, this IS possible in SQL Server 2000. Check out SQL Server Books Online (especially if you're going to use distributed partitioned views) for the do's and don'ts of partitioning data. Michael actually had two very good questions. Answering this one wore me out, so I'll have to post the next one tomorrow. :) UPDATED 9/20/00 - here is the followup question, which deals with splitting application databases into multiple similar databases for each client that uses the application. -SQLGuru
|
- Advertisement - |