SQL help needed! - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Programming

SQL help needed!

Author

Topic

Renogen
Starting Member

6 Posts

Posted - 2011-04-30 : 04:42:45

I have a schema:
PLAY_SONG(uid, songid, datetime)

How do I find the uid pairs (uid1, uid2) in which user uid1 has played all songs played by a different user uid2 and user uid2 has played at least 10 distinct songs?

I only came out with


SELECT uid FROM PLAY_SONG GROUP BY uid HAVING COUNT(DISTINCT songid) > 9

which returns uid of users who have at least 10 distinct songs. No idea how to continue.

Please help, thanks in advance =D

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 08:44:25

Are you on SQL 2005 or later? If so, this would work - it may not be the most efficient, but logically it seems simple.

select distinct
   t1.uid,
   t3.uid
from
   PLAY_SONG t1
   inner join PLAY_SONG t3 on t3.uid <> t1.uid
where
   exists (select * from PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 9)
   and not exists
   (
      select songid from PLAY_SONG t4 where t4.uid = t1.uid
      except
      select songid from PLAY_SONG t5 where t5.uid = t3.uid
   )
order by
   t1.uid,
   t3.uid

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 08:53:02

Created a test table for the query I posted above - but I changed the number of songs from 10 to 3 in the query for testing.

create table PLAY_SONG (uid int, songid int)
 
insert into PLAY_SONG values
(1,1),(1,2), (1,3), 
(2,1), (2,2), (2,5), 
(3,1),(3,2), (3,5), 
(4,1),(4,2),
(5,1),(5,2),(5,3), (5,5);

-- insert query from my previous post here but change 9 to 2

I don't know if you care about it or not, but if two uids have exactly same songid's, that pair of uids is repeated in the results. In the sample data, uids 2 and 3 will be listed as 2, 3 and then as 3,2.

Renogen
Starting Member

6 Posts

Posted - 2011-04-30 : 10:18:31

Hi, thanks for the prompt reply.

my table actually have all 3 as a composite primary key.

ur sql returned me tuples that doesn't seems to relate based on my table.

I have table smth like the following (i will skip the timestamp):
(uid, songid)

(1, {1 to 10}), (2, {1 to 10}, (3, {1 to 20}) and some others insignificant rows.
{1 to 10} meaning (2, 1), (2, 2), (2, 3) and so on.

So by right, i'm expecting to get something returned like (uid1, uid2)
(1, 2)
(2, 1) <--if can skip this will be better
(3, 1)
(3, 2)

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 13:06:36

Based on the sample data that you posted, the query I posted does seem to give the data as you described. To remove the dups, I added some more code. If you run this, it should give you exactly the result that you have in your latest post.

with cte as
(
   select distinct
      t3.uid uid1,
      t1.uid uid2,
      rank() over (order by t3.uid) as rn
   from
      PLAY_SONG t1
      inner join PLAY_SONG t3 on t3.uid <> t1.uid
   where
      exists (select * from PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 2)
      and not exists
      (
         select songid from PLAY_SONG t4 where t4.uid = t1.uid
         except
         select songid from PLAY_SONG t5 where t5.uid = t3.uid
      )
   )
select
  uid1,
  uid2
from
   cte c1
where
   not exists 
   (  
      select * from cte c2 
      where c2.uid1 = c1.uid2 and c2.uid2 = c1.uid1 and c2.rn < c1.rn
   )
order by
   uid1,
   uid2

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 00:51:36

thanks for your help and effort, but it is still not returning me the correct tuples that i need. uid2 should be users that has at least 10 distinct songs. and uid1 should have played the same songs as uid2 or more. which is to say, uid1's songs should be the superset of uid2.

from your sql, it returns me uid1 and uid2 with tuples that has less than 10 songs. (i've changed the count to > 10 instead of 2)

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-01 : 07:46:18

Renogen, I don't know what the problem is. The only thing that comes to mind is that you may have same player+song combination multiple times in your table. If that is the case, the updated script below should fix it.

If the script below does not give you the right answers, it needs someone who has more expertise than I. People may not respond to this thread because they see several replies and would think that the question has been answered. So can you re-post your question, but also include some sample data? Brett's post here would help in posting sample data: http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Anyway, here is my final attempt. And sorry I couldn't be of more help.

--- CREATE SAMPLE TABLE AND DATA ---------

--- table
create table PLAY_SONG (uid int, songid int)

--- sample data. uid 4 is has repeated songid 1, so that should
-- not appear in the results.
insert into PLAY_SONG values
(1,1),(1,2), (1,3), (1,4),(1,5),(1,6),(1,7), (1,8), (1,9), (1,10),
(2,1),(2,2), (2,3), (2,4),(2,5),(2,6),(2,7), (2,8), (2,9), (2,10),
(3,1),(3,2), (3,3), (3,4),(3,5),(3,6),(3,7), (3,8), (3,9), (3,10),
(3,11),(3,12), (3,13), (3,14),(3,15),(3,16),(3,17), (3,18), (3,19), (3,20),
(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1);

--- QUERY  --------- --------- -----------

-- the first CTE picks distinct uid+song combinations.
with DISTINCT_PLAY_SONG as
(
   select distinct uid,songid from PLAY_SONG
)
-- the second cte has the logic to pick uids who have
-- a) played all songs played by another uid AND
-- b) only if that second uid has played at least 10 songs.
,cte as
(
   select distinct
      t3.uid uid1,
      t1.uid uid2,
      rank() over (order by t3.uid) as rn
   from
      DISTINCT_PLAY_SONG t1
      inner join DISTINCT_PLAY_SONG t3 on t3.uid <> t1.uid
   where
      exists (select * from DISTINCT_PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 9)
      and not exists
      (
         select songid from DISTINCT_PLAY_SONG t4 where t4.uid = t1.uid
         except
         select songid from DISTINCT_PLAY_SONG t5 where t5.uid = t3.uid
      )
   )
-- the final select is simply presenting the data from the
-- previous CTE, except, it is removing duplicate permutations.
select
  uid1,
  uid2
from
   cte c1
where
   not exists 
   (  
      select * from cte c2 
      where c2.uid1 = c1.uid2 and c2.uid2 = c1.uid1 and c2.rn < c1.rn
   )
order by
   uid1,
   uid2;

BTW, the insert statement for the sample data would work only if you are on SQL 2008.

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 11:09:08

Thanks alot, it came out with the exactly how I wanted it. now i have to study your query for my exams. Thanks once again =D

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-01 : 13:38:14

quote:
Originally posted by Renogen

now i have to study your query for my exams.

I had no idea!!

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 21:54:47

haha, that question came out in my past yr papers, LOL. it was one of the tricky qns, I will discuss with my classmates and see if we can refine it abit, lol =D

cheers!

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-05-01 : 22:15:24

maybe

select uid2 = p1.uid, uid1 = p2.uid
from PLAY_SONG p1
left join PLAY_SONG p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(distinct p1.songid) >= 10
and count(distinct p1.songid) = count(distinct p2.songid)

You could turn p1 and p2 into derived tables and get distinct songids per uid.
If you do that then you wouldn't need the distincts in the group by..

select uid2 = p1.uid, uid1 = p2.uid
from (select distinct uid, songid from PLAY_SONG) p1
left join (select distinct uid, songid from PLAY_SONG) p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(p1.songid) >= 10
and count(p1.songid) = count(p2.songid)

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Renogen
Starting Member

6 Posts

Posted - 2011-05-03 : 00:43:13

quote:
Originally posted by nigelrivett

maybe

select uid2 = p1.uid, uid1 = p2.uid
from PLAY_SONG p1
left join PLAY_SONG p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(distinct p1.songid) >= 10
and count(distinct p1.songid) = count(distinct p2.songid)

You could turn p1 and p2 into derived tables and get distinct songids per uid.
If you do that then you wouldn't need the distincts in the group by..

select uid2 = p1.uid, uid1 = p2.uid
from (select distinct uid, songid from PLAY_SONG) p1
left join (select distinct uid, songid from PLAY_SONG) p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(p1.songid) >= 10
and count(p1.songid) = count(p2.songid)

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Your sql returns me values like (1, 2) and (2, 1). Anyways to make only 1 appear since they are same if you switch the column ard.

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-05-03 : 03:54:28

change p1.uid <> p2.uid
to
p1.uid < p2.uid

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources