Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Help converting cursor logic into a sql

Author  Topic 

dev2dev
Starting Member

48 Posts

Posted - 2007-12-13 : 11:19:29
Hello people,

I have a stored procedure that loops through millions of records, what makes even worse is its a dynamic cursor (dynamic sql is used in un avoidable circumstance), it is already running in production and our client is happy (performance wise)

now I need your help converting this cursor logic into sql query, It may not be using one single query, but its okey for me to use temporary/intermediate tables for achieving it

my table has two columns
SeqNo (int) -- this is running sequence# (unique)
IndexNo (int) -- this can be 0 to 3
(there are other columns like 'time' which is also unique and lots of other columns with other info)

====

this is how sample data looks

SeqNo IndexNo (just comments, not a db column)
=========================================
1 0 -- 1st Task of cycle 1 started here
2 0 -- 1st task of cycle 1 ended here
3 1 -- 2nd task of cycle 1 started here
4 1
5 1 -- 2nd task of cycle 1 started here
6 2
7 2
8 2
9 3
10 3 -- cycle ended (4th task of cycle 1)
11 0 -- cycle 2 (task 1)
12 0
13 0
14 0
15 1
16 1
17 2
18 2
19 3 -- cycle 2 (task 4)
20 0 -- cycle 3 (task 1)
21 1

and so on.....

(note: cycle is not important here, need to see only the task switch
actual IndexNo is between 0-10,for simplicity i gave 0-3
it is not compolsory that in a cycle, there are always 0-10 TaskIndex available, there are chances that few TaskIndex can miss, but they are always in sequence, i.e, TaskIndex 0 followed by 1 and it followed by 2 and so on)

This data set contains rows pertaining to all cycles where each cycle has 4 tasks with 1-n rows per task(i hope u understood)

now i have much more complex things to do with this data but If you can help me to derive a new datasets from this with one new derived column 'TaskNo'

there are 10 Tasks in the above sample data, the result set should be like below


SeqNo IndexNo TaskNo
====================
1 0 1
2 0 1
3 1 2
4 1 2
5 1 3
6 2 4
7 2 4
8 2 4
9 3 5
10 3 5
11 0 6
12 0 6
13 0 6
14 0 6
15 1 7
16 1 7
17 2 8
18 2 8
19 3 9
20 0 10


SeqNo 1-2 : 1
SeqNo 3-2 : 2
SeqNo 6-2 : 3
SeqNo 9-2 : 4
SeqNo 11-2 : 5
SeqNo 15-2 : 6
SeqNo 17-2 : 7
SeqNo 19-2 : 8
SeqNo 20-2 : 9
SeqNo 21-2 : 10

if you see the cursor logic

declare taskno,CurTaskIndex,PrevTaskIndex
taskno = 0
PrevTaskIndex = -1
loop through cursor (CurTaskIndex = TaskIndex)
if CurTaskIndex <> PrevTaskIndex then
TaskNo = TaskNo+1
end if
end loop

Is it possible to put the above logic in a SQL query(s) using intermediate/temporary tables?

Please ask me if anything is not clear

dev2dev
Starting Member

48 Posts

Posted - 2007-12-13 : 11:23:56
I have liberty to use SQL2K5

Thanks in advance
Go to Top of Page

dev2dev
Starting Member

48 Posts

Posted - 2007-12-27 : 00:41:29
I have done it! I achieved it using recursive CTE. I will post the complete script/sql
Go to Top of Page

dev2dev
Starting Member

48 Posts

Posted - 2007-12-27 : 08:00:03
here it is with complete schema and sample data
======

--
-- DDL
--
if exists (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[TMP_TASKS]') AND OBJECTPROPERTY(id, N'IsUserTable') = 1)
DROP TABLE dbo.TMP_TASKS
go
CREATE TABLE dbo.TMP_TASKS (
SEQ_NO int IDENTITY NOT NULL,
INDEX_NO int NOT NULL,
)
--
-- Sample data
--
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (3)
INSERT INTO TMP_TASKS VALUES (3)
INSERT INTO TMP_TASKS VALUES (3)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (3)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (0)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (1)
INSERT INTO TMP_TASKS VALUES (2)
INSERT INTO TMP_TASKS VALUES (2)
--
-- View (coz CTE is used to write reusable and complex queries
-- but unfortunatly a a recursive CTE has lot of restrictions such as,
-- cant have outer join and cte cannot be in subquery which is very much needed for this solution)
--
if exists (SELECT * FROM dbo.sysobjects WHERE id = OBJECT_ID(N'[dbo].[VW_TMP_TASKS]') AND OBJECTPROPERTY(id, N'IsView') = 1)
DROP VIEW dbo.VW_TMP_TASKS
go
CREATE VIEW dbo.VW_TMP_TASKS AS
SELECT A.SEQ_NO,A.INDEX_NO CUR_IDX_NO,ISNULL(B.INDEX_NO,-1) NXT_IDX_NO
FROM TMP_TASKS A
LEFT OUTER JOIN TMP_TASKS B
ON A.SEQ_NO = B.SEQ_NO + 1
GO
--
-- Actual query (recursive CTE)
--
WITH CTE_TASKS (SEQ_NO,CUR_IDX_NO,NXT_IDX_NO,CYCLE_ID) AS (
-- Anchor member (god i wasted lot of time by jumping into writing recursive query without understanding it throughly)
SELECT SEQ_NO,CUR_IDX_NO,NXT_IDX_NO,1 CYCLE_ID
FROM VW_TMP_TASKS
WHERE SEQ_NO = (SELECT MIN(SEQ_NO)
FROM VW_TMP_TASKS)
UNION ALL
-- Recursive member
SELECT A.SEQ_NO,A.CUR_IDX_NO,A.NXT_IDX_NO,
CASE WHEN A.CUR_IDX_NO=A.NXT_IDX_NO THEN B.CYCLE_ID ELSE B.CYCLE_ID+1 END
FROM VW_TMP_TASKS A
INNER JOIN CTE_TASKS B
ON A.SEQ_NO = B.SEQ_NO+1
)
--

-- Now select

SELECT SEQ_NO,CUR_IDX_NO,CYCLE_ID FROM CTE_TASKS

-- Select first and last sequence number from each task cycle

SELECT CYCLE_ID,MIN(SEQ_NO)FIRST_SEQ_NO,MAX(SEQ_NO)LAST_SEQ_NO
FROM CTE_TASKS
GROUP BY CYCLE_ID

======
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2007-12-29 : 20:43:36
How long does your loop code currently take that the customer is "happy" with? Even in the presence of a Clustered Index on Seq_No, the recurrsive CTE/View combination makes 3 index scans and 1 index seek... I wouldn't be surprised if the R-CTE/View combo was actually slower than the loop.

Tell me how many millions of rows you have so I can setup a test for a set-based method.

--Jeff Moden
Go to Top of Page

dev2dev
Starting Member

48 Posts

Posted - 2007-12-31 : 02:37:14
it ranges from 15K to 150K

mostly we get around a million rows

what i explained here is very less, there are multiple insert statements goes on when ever there is change in a cycle and of course few more which depends on other columns, so basically i wanted to avoid cursor loop and multiple inserts which i felt can be done in few insert statements

currently it takes around 5-15 minutes to finish

yes the recursive cte could be even more expensive with this huge data, but i am hoping for gain in overall performance

thanks
Go to Top of Page

dev2dev
Starting Member

48 Posts

Posted - 2007-12-31 : 02:46:34
have i told you this? all this happens in dynamic sql
Go to Top of Page

Jeff Moden
Aged Yak Warrior

652 Posts

Posted - 2007-12-31 : 12:53:17
No problem... just wanted you to know that Recurssion is just another form of looping with all the performance drains associated with looping.

--Jeff Moden
Go to Top of Page
   

- Advertisement -