SQL for Threaded Discussion (Part II)
By Bill Graziano
on 7 December 2000
| Tags:
Application Design
webguru22 writes "Hi SQL gurus, Here's a twist on the old "threaded message board query" question: I need to come up with a data model that will let me have elements in an "outline" with numbered sub-elements." Actually a paraphrased his question quite a bit. You can see the original question in the forums though.
This is the first question I've pulled out of the forums to answer. I've been wanting to write this article for quite a while now and
webguru22 just gave me a good reason. Earlier I had an
article on a table structure and code for threaded discussions. My goal was to make the queries as simple as possible. This article aims for the same result but keeps your table as simple as possible. It also allows us to do some pretty neat things with recursive stored procedures. My examples try to cover both posts and the hierarchial numbering scheme from the question. It should work for any generic "expanding hierarchy" problem.
We'll start with a table that looks like this:
CREATE TABLE [Posts] (
[PostID] [int] IDENTITY (1, 1) NOT NULL ,
[Subject] [char] (50) NULL ,
[ParentID] [int] NULL ,
[PostSortKey] [datetime] NOT NULL )
GO
ALTER TABLE [Posts] WITH NOCHECK ADD
CONSTRAINT [DF_Posts_SortKey] DEFAULT (getdate()) FOR [PostSortKey]
GO
The PostID
column is our primary key for this table. Subject
is the subject of the post. For simplicity sake I removed all other post related fields that were not absolutely necessary such as author and the body of the post. ParentID
is the PostID of the post for the parent. If this post is a top level post, then this field will be zero. PostSortKey
is the sort order for the posts. This will usually be a datetimefield. In my example here, I used a default of GETDATE() to populate the field. In the specific example of this question it could be whatever the user wanted.
My example will use a dataset that looks like this:
PostID Subject ParentID PostSortKey
----------- -------------------------------------------------- ----------- -----------------------
1 First Post 0 2000-12-06 20:54:29.407
2 Second Post 0 2000-12-06 20:54:29.407
3 Child of First Post 1 2000-12-06 20:54:29.407
4 Child of Second Post 2 2000-12-06 20:54:29.407
5 Child of First Child 3 2000-12-06 20:54:29.407
6 Another FP Child 1 2000-12-06 21:04:49.217
7 Smallest Kid 5 2000-12-06 21:18:49.203
8 Another Munchkin 3 2000-12-06 21:28:22.040
Populating a table like this should be very easy. You just need to insert a record with it's parent ID.
This example is composed of two pieces of code. The first is a SQL Script to create a temporary table and call the stored procedure. It looks like this:
Create Table #NestedPosts (
SortID int IDENTITY (1,1),
PostID int,
PostKey varchar(200),
PostLevel int )
exec getchildren 0, 1, ''
SELECT SortID,
P.PostID,
ParentID,
PostLevel,
PostKey = LEFT(PostKey, 10),
PostSortKey = convert(varchar(19), PostSortKey, 120),
Subject = LEFT( SPACE( (PostLevel-1) * 2 ) + Subject , 40)
FROM #NestedPosts N JOIN Posts P
ON N.PostID = P.PostID
Order by SortID
DROP TABLE #NestedPosts
The temporary table is populated by the stored procedure. The final SELECT prints out the results. Next is the stored procedure. It looks like this:
CREATE PROC GetChildren
(@ParentID int, @PostLevel int, @ParentPostKey varchar(200) ) AS
SET NOCOUNT ON
DECLARE @NextLevel int, @Counter int, @PostKey varchar(200)
SET @Counter = 1
-- Build a cursor to loop through all the kids of this post
DECLARE c1 CURSOR LOCAL FOR
SELECT PostID
FROM Posts
WHERE ParentID = @ParentID
Order by PostSortKey ASC
OPEN c1
FETCH NEXT FROM c1
INTO @ParentID
WHILE @@FETCH_STATUS = 0
BEGIN
-- Build a key up as we go
IF @PostLevel = 1
SET @PostKey = convert(varchar, @Counter)
ELSE
SET @PostKey = @ParentPostKey + '.' + convert(varchar, @Counter)
-- Put this record in the temp table
INSERT #NestedPosts (PostID, PostKey, PostLevel) VALUES (@ParentID, @PostKey, @PostLevel)
SET @NextLevel = @PostLevel + 1
-- Process all the children for this post
EXEC GetChildren @ParentID, @NextLevel, @PostKey
SET @Counter = @Counter + 1
-- And get the next record at this level
FETCH NEXT FROM c1
INTO @ParentID
END
CLOSE c1
DEALLOCATE c1
SET NOCOUNT OFF
This stored procedure runs through all the child posts for a given parent post and puts them into the temp table. It also builds up the hierarchical numbering scheme that was asked about in the question. The really tricky part is that it calls itself in order to handle any child records of the current record it's processing. You can have up to 32 levels of nesting in your stored procedure. That means this little sample is limited to 32 levels of posts. I also could have used a variable called @@NESTLEVEL
which SQL Server uses to track how far down a recursion tree you are. I built and manually maintained a variable called @PostLevel
for the task.
The output from the SELECT statement above looks like this:
SortID PostID ParentID PostLevel PostKey PostSortKey Subject
----------- ----------- ----------- ----------- ---------- ------------------- -------------------------
1 1 0 1 1 2000-12-06 20:54:29 First Post
2 3 1 2 1.1 2000-12-06 20:54:29 Child of First Post
3 5 3 3 1.1.1 2000-12-06 20:54:29 Child of First Child
4 7 5 4 1.1.1.1 2000-12-06 21:18:49 Smallest Kid
5 8 3 3 1.1.2 2000-12-06 21:28:22 Another Munchkin
6 6 1 2 1.2 2000-12-06 21:04:49 Another FP Child
7 2 0 1 2 2000-12-06 20:54:29 Second Post
8 4 2 2 2.1 2000-12-06 20:54:29 Child of Second Post
That's about all there is to this. Try it and let me know what you think.
-graz