Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 SQL help needed!

Author  Topic 

Renogen
Starting Member

6 Posts

Posted - 2011-04-30 : 04:42:45
I have a schema:
PLAY_SONG(uid, songid, datetime)

How do I find the uid pairs (uid1, uid2) in which user uid1 has played all songs played by a different user uid2 and user uid2 has played at least 10 distinct songs?

I only came out with

SELECT uid FROM PLAY_SONG GROUP BY uid HAVING COUNT(DISTINCT songid) > 9


which returns uid of users who have at least 10 distinct songs. No idea how to continue.

Please help, thanks in advance =D

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 08:44:25
Are you on SQL 2005 or later? If so, this would work - it may not be the most efficient, but logically it seems simple.
select distinct
t1.uid,
t3.uid
from
PLAY_SONG t1
inner join PLAY_SONG t3 on t3.uid <> t1.uid
where
exists (select * from PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 9)
and not exists
(
select songid from PLAY_SONG t4 where t4.uid = t1.uid
except
select songid from PLAY_SONG t5 where t5.uid = t3.uid
)
order by
t1.uid,
t3.uid
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 08:53:02
Created a test table for the query I posted above - but I changed the number of songs from 10 to 3 in the query for testing.
create table PLAY_SONG (uid int, songid int)

insert into PLAY_SONG values
(1,1),(1,2), (1,3),
(2,1), (2,2), (2,5),
(3,1),(3,2), (3,5),
(4,1),(4,2),
(5,1),(5,2),(5,3), (5,5);

-- insert query from my previous post here but change 9 to 2

I don't know if you care about it or not, but if two uids have exactly same songid's, that pair of uids is repeated in the results. In the sample data, uids 2 and 3 will be listed as 2, 3 and then as 3,2.
Go to Top of Page

Renogen
Starting Member

6 Posts

Posted - 2011-04-30 : 10:18:31
Hi, thanks for the prompt reply.

my table actually have all 3 as a composite primary key.

ur sql returned me tuples that doesn't seems to relate based on my table.

I have table smth like the following (i will skip the timestamp):
(uid, songid)

(1, {1 to 10}), (2, {1 to 10}, (3, {1 to 20}) and some others insignificant rows.
{1 to 10} meaning (2, 1), (2, 2), (2, 3) and so on.


So by right, i'm expecting to get something returned like (uid1, uid2)
(1, 2)
(2, 1) <--if can skip this will be better
(3, 1)
(3, 2)
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-04-30 : 13:06:36
Based on the sample data that you posted, the query I posted does seem to give the data as you described. To remove the dups, I added some more code. If you run this, it should give you exactly the result that you have in your latest post.
with cte as
(
select distinct
t3.uid uid1,
t1.uid uid2,
rank() over (order by t3.uid) as rn
from
PLAY_SONG t1
inner join PLAY_SONG t3 on t3.uid <> t1.uid
where
exists (select * from PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 2)
and not exists
(
select songid from PLAY_SONG t4 where t4.uid = t1.uid
except
select songid from PLAY_SONG t5 where t5.uid = t3.uid
)
)
select
uid1,
uid2
from
cte c1
where
not exists
(
select * from cte c2
where c2.uid1 = c1.uid2 and c2.uid2 = c1.uid1 and c2.rn < c1.rn
)
order by
uid1,
uid2
Go to Top of Page

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 00:51:36
thanks for your help and effort, but it is still not returning me the correct tuples that i need. uid2 should be users that has at least 10 distinct songs. and uid1 should have played the same songs as uid2 or more. which is to say, uid1's songs should be the superset of uid2.

from your sql, it returns me uid1 and uid2 with tuples that has less than 10 songs. (i've changed the count to > 10 instead of 2)
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-01 : 07:46:18
Renogen, I don't know what the problem is. The only thing that comes to mind is that you may have same player+song combination multiple times in your table. If that is the case, the updated script below should fix it.

If the script below does not give you the right answers, it needs someone who has more expertise than I. People may not respond to this thread because they see several replies and would think that the question has been answered. So can you re-post your question, but also include some sample data? Brett's post here would help in posting sample data: http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Anyway, here is my final attempt. And sorry I couldn't be of more help.
--- CREATE SAMPLE TABLE AND DATA ---------

--- table
create table PLAY_SONG (uid int, songid int)

--- sample data. uid 4 is has repeated songid 1, so that should
-- not appear in the results.
insert into PLAY_SONG values
(1,1),(1,2), (1,3), (1,4),(1,5),(1,6),(1,7), (1,8), (1,9), (1,10),
(2,1),(2,2), (2,3), (2,4),(2,5),(2,6),(2,7), (2,8), (2,9), (2,10),
(3,1),(3,2), (3,3), (3,4),(3,5),(3,6),(3,7), (3,8), (3,9), (3,10),
(3,11),(3,12), (3,13), (3,14),(3,15),(3,16),(3,17), (3,18), (3,19), (3,20),
(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1),(4,1);

--- QUERY --------- --------- -----------

-- the first CTE picks distinct uid+song combinations.
with DISTINCT_PLAY_SONG as
(
select distinct uid,songid from PLAY_SONG
)
-- the second cte has the logic to pick uids who have
-- a) played all songs played by another uid AND
-- b) only if that second uid has played at least 10 songs.
,cte as
(
select distinct
t3.uid uid1,
t1.uid uid2,
rank() over (order by t3.uid) as rn
from
DISTINCT_PLAY_SONG t1
inner join DISTINCT_PLAY_SONG t3 on t3.uid <> t1.uid
where
exists (select * from DISTINCT_PLAY_SONG t2 where t2.uid = t1.uid having COUNT(*) > 9)
and not exists
(
select songid from DISTINCT_PLAY_SONG t4 where t4.uid = t1.uid
except
select songid from DISTINCT_PLAY_SONG t5 where t5.uid = t3.uid
)
)
-- the final select is simply presenting the data from the
-- previous CTE, except, it is removing duplicate permutations.
select
uid1,
uid2
from
cte c1
where
not exists
(
select * from cte c2
where c2.uid1 = c1.uid2 and c2.uid2 = c1.uid1 and c2.rn < c1.rn
)
order by
uid1,
uid2;

BTW, the insert statement for the sample data would work only if you are on SQL 2008.
Go to Top of Page

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 11:09:08
Thanks alot, it came out with the exactly how I wanted it. now i have to study your query for my exams. Thanks once again =D
Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2011-05-01 : 13:38:14
quote:
Originally posted by Renogen

now i have to study your query for my exams.

I had no idea!!
Go to Top of Page

Renogen
Starting Member

6 Posts

Posted - 2011-05-01 : 21:54:47
haha, that question came out in my past yr papers, LOL. it was one of the tricky qns, I will discuss with my classmates and see if we can refine it abit, lol =D

cheers!
Go to Top of Page

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-05-01 : 22:15:24
maybe

select uid2 = p1.uid, uid1 = p2.uid
from PLAY_SONG p1
left join PLAY_SONG p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(distinct p1.songid) >= 10
and count(distinct p1.songid) = count(distinct p2.songid)

You could turn p1 and p2 into derived tables and get distinct songids per uid.
If you do that then you wouldn't need the distincts in the group by..

select uid2 = p1.uid, uid1 = p2.uid
from (select distinct uid, songid from PLAY_SONG) p1
left join (select distinct uid, songid from PLAY_SONG) p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(p1.songid) >= 10
and count(p1.songid) = count(p2.songid)


==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

Renogen
Starting Member

6 Posts

Posted - 2011-05-03 : 00:43:13
quote:
Originally posted by nigelrivett

maybe

select uid2 = p1.uid, uid1 = p2.uid
from PLAY_SONG p1
left join PLAY_SONG p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(distinct p1.songid) >= 10
and count(distinct p1.songid) = count(distinct p2.songid)

You could turn p1 and p2 into derived tables and get distinct songids per uid.
If you do that then you wouldn't need the distincts in the group by..

select uid2 = p1.uid, uid1 = p2.uid
from (select distinct uid, songid from PLAY_SONG) p1
left join (select distinct uid, songid from PLAY_SONG) p2
on p1.uid <> p2.uid
and p1.songid = p2.songid
group by p1.uid, p2.uid
having count(p1.songid) >= 10
and count(p1.songid) = count(p2.songid)


==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.



Your sql returns me values like (1, 2) and (2, 1). Anyways to make only 1 appear since they are same if you switch the column ard.
Go to Top of Page

nigelrivett
Master Smack Fu Yak Hacker

3385 Posts

Posted - 2011-05-03 : 03:54:28
change p1.uid <> p2.uid
to
p1.uid < p2.uid

==========================================
Cursors are useful if you don't know sql.
SSIS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page
   

- Advertisement -