Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Select Distinct Not Working

Author  Topic 

meberg66219
Yak Posting Veteran

65 Posts

Posted - 2010-11-08 : 15:45:17
I have a query written to access data from a view with a min date. I wrote the original view with a select distinct for the ADEnrollID table, however the query is returning duplicate ADEnrollID.

What I need to return is the the min date per ADEnrollID.


SELECT DISTINCT mb.SyStudentID, mb.LastName, mb.FirstName, mb.EnrollDate, mb.SchedStartDate, mb.Descrip, MB.ADEnrollID
FROM dbo.MB_CompareSchedStartDate1 AS mb INNER JOIN
(SELECT ADEnrollID, MIN(SchedStartDate) AS mindate
FROM dbo.MB_CompareSchedStartDate1
GROUP BY LastName, AdEnrollID) date ON
mb.AdEnrollID=date.AdEnrollID AND
mb.SchedStartDate = date.mindate

Here is what I'm getting as a return.

FirstName EnrollDate Schedstartdate Descrip ADEnroll ID
Bob 2008-01-31 2008-01-31 Strategy 16808
Bob 2008-01-31 2008-01-31 Windows 16808
Bob 2009-05-04 2009-05-04 Math 17717

What I want is just the first course taken for each AdEnroll ID.

I would have thought the Select Distinct would only return the top ID with the Min date.

The primary view that the data is pulling from is written as so:

SELECT DISTINCT
TOP (100) PERCENT dbo.syStudent.SyStudentId, dbo.syStudent.LastName, dbo.syStudent.FirstName, dbo.AdEnroll.ExpStartDate,
dbo.AdEnrollSched.StartDate AS SchedStartDate, dbo.AdEnroll.EnrollDate, dbo.AdEnrollSched.Descrip, dbo.AdEnroll.AdEnrollID
FROM dbo.syStudent LEFT OUTER JOIN
dbo.AdEnroll ON dbo.syStudent.SyStudentId = dbo.AdEnroll.SyStudentID LEFT OUTER JOIN
dbo.AdEnrollSched ON dbo.AdEnroll.AdEnrollID = dbo.AdEnrollSched.AdEnrollID
WHERE (dbo.AdEnrollSched.StartDate IS NOT NULL) AND (dbo.AdEnroll.ExpStartDate IS NOT NULL) AND
(dbo.AdEnroll.ExpStartDate <> dbo.AdEnrollSched.StartDate) AND (dbo.AdEnroll.EnrollDate <> dbo.AdEnroll.ExpStartDate)
ORDER BY dbo.syStudent.LastName, dbo.syStudent.SyStudentId

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-11-08 : 18:23:59
Which of these rows should be selected? And what logic do you apply to determine how that row is selected?

Bob 2008-01-31 2008-01-31 Strategy 16808
Bob 2008-01-31 2008-01-31 Windows 16808

Also, it might be helpfull if we had DDL, DML and expected output so we can run queries and help you get a correct answer quicker. This link might help you provide that information:
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
Go to Top of Page

meberg66219
Yak Posting Veteran

65 Posts

Posted - 2010-11-09 : 08:29:12
Basically,

The logic is if 2008/01/31 is the min date and for an ID number 16808 then ignore the other records. However I am not sure how to accomplish this task.

Go to Top of Page

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-11-09 : 11:40:02
You should be able to make use of a RANKing function. If you need help implementing one of those, please follow the link I posted above and provide DDL, DML and expected output.
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2010-11-09 : 14:24:37
1) SELECT DISTINCT removes redundant duplicate rows

2) Do not use reserved words like DATE for data element _names. It is too damn vague anyway.

3) One of the symptoms of a bad query is the same table appearing more than once in it.


SELECT DISTINCT MB.systudent_id, MB.last_name, MB.first_name, MB.enroll_date, MB.sched_start_date, MB.something_descrip, MB.ad_enroll_id
FROM MB_Compare_Sched_Start_Date1 AS MB
INNER JOIN
(SELECT ad_enroll_id, MIN(sched_start_date) AS start_date_min
FROM MB_Compare_Sched_Start_Date1
GROUP BY last_name, ad_enroll_id) AS something_date

ON
MB.ad_enroll_id=_something_date.ad_enroll_id
AND MB.schedstart_date = something_date.start_date_min

>> here is what I’m getting as a return. <<

Which is exactly what you asked for.

>> I would have thought the SELECT DISTINCT would only return the top ID with the Min something_date. <<

Why did you think that? People seldom make random errors; they usual have a misconception or the wrong mindset that leads to the error. You don't understand what a set is AND wanted to see a sequential ordering. The way that you wrote "SELECT DISTINCT TOP (100) PERCENT..ORDER BY.." is a serious symptom. VIEWs do not get ORDER BY clauses AND we don't use TOP in them.

Here is a short untested re-write with a CTE. The trick is using an aggregate function with a window clause to find the first sched_start_date

WITH X
AS
(SELECT systudent_id, last_name, first_name, enroll_date,
sched_start_date, something_descrip, ad_enroll_id,
MIN(sched_start_date) OVER (PARTITION BY systudent_id) AS start_date_min
FROM MB_Compare_Sched_Start_Date1 )

SELECT X.*
FROM X
WHERE X.start_date_min = X.sched_start_date;


--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page
   

- Advertisement -