Relationship question - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

Relationship question

Author

Topic

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-01 : 12:56:05

Hi,

I have a table of "jobs", each of which can have many "tasks" assigned to it. Each task has a "DueDate" field. I know how to pull out a list of jobs and related tasks using "SELECT <job and task fields> FROM Jobs j INNER JOIN Tasks t ON t.JobId = j.JobId".

But I need to show a list of jobs and only the *first upcoming* task related to it, if any. That is, "WHERE Tasks.DueDate > GETUTCDATE()" but only the *first* one for each job. The resultset I want is a *single* record per job with the selected job and task fields.

Is there a way to do that? I can't work that one out.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-01 : 13:07:57

see scenario 2 here

http://visakhm.blogspot.com/2010/01/multipurpose-apply-operator.html

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-01 : 13:25:23

Thanks Visakh, but should have said I'm restricted to SQL 2000 here. :(

I thought of one way, which does work when there is at least one upcoming Task record:

SELECT <job and task fields> FROM Jobs j LEFT OUTER JOIN Tasks t ON j.JobId = t.JobId WHERE t.TaskId IN( SELECT TOP 1 t2.TaskId FROM Tasks t2 WHERE t2.JobId = j.JobId AND t2.DueDate > GETUTCDATE())

However if there is *no* upcoming task, it does not return the Job and NULLs for the task, it skips the job record altogether - even though it's an outer join. I'm so close here.. can anyone tell me how to still get the job record even if the subquery doesn't find a task record?

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-01 : 13:32:28

Wait.. I got it! I put the subquery in the join condition. That allows the Job record to be found and return null task fields where no task record is found. This works:

SELECT <job and task fields> FROM Jobs j LEFT OUTER JOIN Tasks t ON j.JobId = t.JobId AND t.TaskId = (SELECT TOP 1 t2.TaskId FROM Tasks t2 WHERE t2.JobId = j.JobId AND t2.DueDate > GETUTCDATE())

Probably horribly inefficient, but.. yay! :)
Is there a best-practice way to do this in SQL 2000?

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-08-01 : 15:24:26

Every SQL forum or newsgroup expects that you will post DDL, sample data and clear specifications which might might include output. This is explained in the FAQ section. We can not read minds; what would we need to know to do your job for you?

Tables must have keys and should have DRI, constraints, and all the basic features of a schema. You should know use ISO-8601 Standards for temporal data, avoid needless dialect, basic data modeling and use ISO-11179 Standards for data element names. Please tell us if you can change the DDL.

>> I have a table of "jobs", each of which can have many "tasks" assigned to it. Each task has a "due_date" field [sic: columns are not fields]. I know how to pull out a list of jobs and related tasks using

SELECT <job and task columns>
FROM Jobs AS J,
Tasks AS T
WHERE T.job_id = J.job_id;

>> But I need to show a list of jobs and only the *first upcoming* task related to it, if any. That is, "WHERE Tasks.due_date > GETUTCDATE()" but only the *first* one for each job. <<

There are problems with your spec. We often start or finish tasks in parallel. What do we do about task that are over_due? Without any DDL, I have to make guesses. Do you post the date a task is completed or not? I am assuming that the table you did not bother to post is

CREATE TABLE Tasks
(job_id INTEGER NOT NULL,
task_id INTEGER NOT NULL,
PRIMARY KEY (job_id, task_id),
due_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
completion_date DATE,
..);

WITH AS Unfinished_Tasks (job_id, task_id, due_date)
AS
(SELECT job_id, task_id, due_date
FROM Tasks
WHERE completion_date IS NULL)

SELECT U1.job_id, U1.task_id
FROM Unfinished_Tasks AS U1
WHERE U1.due_date
= (SELECT MIN(U2.due_date)
FROM Unfinished_Tasks AS U2
WHERE U1.job_id = U2.job_id);

--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-02 : 02:30:24

Eek. I thought this was called the "New to SQL" forum for a reason. :)

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-02 : 02:37:39

try this too


SELECT <job and task fields> FROM Jobs j 
LEFT OUTER JOIN (SELECT t1.* FROM Tasks t1
JOIN (SELECT JobId,MIN(taskID) AS First
     FROM Tasks
     WHERE DueDate > GETUTCDATE()
     GROUP BY JobId)t2
ON t2.JobId=t1.JobId
AND t2.First = t1.taskID)t
ON j.JobId = t.JobId

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-03 : 02:20:32

Thanks visakh, that's a new method to me, very interesting! I'll give it a go.

jcelko, thanks heaps for the pointers. I will include DDL next time. The reason I didn't go into much detail is I thought the "task/job" paradigm was very familiar, and for the sake of my problem I was only concentrating on that one column. There are constraints etc in the schema.

One question though: In the Tasks table, you have (job_id, task_id) as the primary index. I usually just leave the identity column (task_id) as primary and have the parent table column (job_id) as a separate index. Other indexes include it along with columns I will sort by, eg. (job_id, due_date). Is that not right?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-03 : 02:49:28

If tasks repeat for more than one job you cant make TaskID PK. In that case you've to go for composite PK

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-03 : 03:25:31

Ah I see, thanks visakh. I assume in that case TaskID is not an IDENTITY column? Otherwise you couldn't have the same task for more than one job. In my case it's an identity, only for one job, so I just leave it as PK.

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-03 : 05:05:24

Yep..You're correct

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

waveform
Yak Posting Veteran

93 Posts

Posted - 2011-08-03 : 05:28:12

Thanks again visakh. Out of curiosity - if a table is not using an IDENTITY column, how does one give a new ID to a new record? I mean if you're determining the new ID manually, from the last ID in the table, what stops two simultaneous INSERTs ending up with the same ID? Is there a special way of doing that sort of thing?

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2011-08-03 : 05:47:51

if its not using IDENTITY. you need to have a generate function which generates the next id based on sequence. also you need to make its not getting executed parallely except if inside a separate transaction.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources