Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 how to get two rows during 2 tables join

Author  Topic 

Sny
Starting Member

4 Posts

Posted - 2011-05-21 : 16:51:09
Hi,

I have to find two most similar readers for every zip code but I can not know how to do it. my tables;

book(bookid, title, author, category )
reader(readerid, name, address, zip)
action(actionid,date, type, bookid, readerid)


SELECT *
FROM (SELECT zip FROM reader GROUP BY zip) t1,

(SELECT a.readerid,b.readerid compare_readerid,a.zip ZipFirstReader,count(*) booksInCommon
FROM (SELECT r.readerid,a.bookid,r.zip
FROM reader r,action a
WHERE a.readerid = r.readerid and a.typex='B'
GROUP BY r.readerid,a.bookid,r.zip) a ,
(SELECT r.readerid,a.bookid,r.zip
FROM reader r,action a
WHERE a.readerid = r.readerid and a.typex='B'
GROUP BY r.readerid,a.bookid,r.zip) b
WHERE a.bookid = b.bookid and a.readerid <> b.readerid
GROUP BY a.readerid,b.readerid,a.zip) t2
--order by booksInCommon DESC) t2

Thank you

Andrew Zwicker
Starting Member

11 Posts

Posted - 2011-05-22 : 03:23:20
Hi,

The following query should help. It looks like you are on the right track in terms of the query you have - it is using zip as a baseline and going from there to do the comparisons between readers.

For the solution, here's some data to work with:

create table book(bookid int, title varchar(30), author varchar(20), category varchar(50))
create table reader(readerid int, [name] varchar(30), address varchar(70), zip int)
create table action(actionid int, date datetime, type varchar(10), bookid int, readerid int)

insert into book values(1, 'Book 1', 'Joe Smith', 'Fiction')
insert into book values(2, 'Book 2', 'Joe Smith', 'NonFiction')
insert into book values(3, 'Book 3', 'Greg Jones', 'Fiction')
insert into book values(4, 'Book 4', 'Jane Gomes', 'Fiction')
insert into reader values(1, 'Alex', '22 Green St',11111)
insert into reader values(2, 'Bill', '4 Stone Street',11111)
insert into reader values(3, 'Sue', '6 Main Street',53344)
insert into reader values(4, 'Sue', '2 Hamilton Road',11111)
insert into reader values(5, 'Mark', '2 Rocky Avenue',53344)
insert into reader values(6, 'Sam', '22 Blue St',53344)
insert into reader values(7, 'Tom', '4 Hammersmith Street',53344)
insert into reader values(8, 'John', '6 Highland Street',53344)
insert into reader values(9, 'Greg', '2 Ocean Road',11111)
insert into reader values(10, 'Bobby', '2 Hammer Avenue',11111)

insert into action values(3, '1/8/2011', 'B', 3, 2)
insert into action values(4, '1/10/2011', 'B', 4, 1)
insert into action values(5, '1/10/2011', 'B', 4, 2)
insert into action values(6, '1/1/2011', 'B', 1, 5)
insert into action values(7, '1/4/2011', 'B', 2, 6)
insert into action values(8, '1/8/2011', 'B', 2, 7)
insert into action values(9, '1/10/2011', 'B', 4, 8)
insert into action values(10, '1/10/2011', 'B', 1, 9)
insert into action values(11, '1/10/2011', 'B', 3, 3)
insert into action values(12, '1/10/2011', 'B', 2, 4)
insert into action values(13, '1/1/2011', 'B', 1, 5)
insert into action values(14, '1/4/2011', 'B', 2, 1)
insert into action values(15, '1/8/2011', 'B', 1, 6)
insert into action values(16, '1/10/2011', 'B', 4, 1)
insert into action values(15, '1/8/2011', 'B', 2, 5)

With this in place, I decided to break up the solution into 2 parts [see below for the solution]. The first gets a list of, for each pair that has a match, how many matches there are. This is similar to the query you have, although I put in a > restriction in the join instead of <> for the readerid to prevent duplicates (for example, [reader A, reader B] is a duplicate of [reader B, reader A] ).

For the second part I use a derived table to get the maximum number of matches for each zipcode, and then grab the needed information for displaying the reader information.

One minor [potential] issue that remains is that, if 2+ pairs of readers have the same number of common books, they'll both show up on the list. It is just something to be aware of.

Also, note that the joins below use the 'on abc=xyz' type syntax and the joins aren't done using simply commas. As queries get more complex, I think it is easier to follow/read the query using the 'on' type of join syntax.

select ZipList.zip, reader1.readerid as 'r1', reader2.readerid as 'r2', count(*) as 'NumBookMatches'
into #NumMatchesInfo
From
(
select distinct zip from reader
) ZipList
inner join reader reader1 on reader1.zip = ZipList.zip
inner join reader reader2 on reader2.zip = ZipList.zip and reader1.readerid > reader2.readerid
inner join action action1 on action1.readerid = reader1.readerid and action1.type='B'
inner join action action2 on action2.readerid = reader2.readerid and action1.type='B' and action2.bookid = action1.bookid
group by ZipList.zip, reader1.readerid, reader2.readerid


select MatchCount.zip, reader1.name, reader2.name From #NumMatchesInfo matchList
inner join
(
select zip, max(NumBookMatches) as 'NumBookMatches'
from #NumMatchesInfo
group by zip
) MatchCount on MatchCount.NumBookMatches = matchList.NumBookMatches
inner join reader reader1 on reader1.readerid = matchList.r1
inner join reader reader2 on reader2.readerid = matchList.r2


drop table #NumMatchesInfo

Based on the question, some other things you might want to consider [if you want to build this query out even more] are:

1) Do you need to take into consideration the values of the type field? The example only has 'B', but what do the other codes represent? Might you need/want to match for other types?
2) You might want to consider readers who read books from the same author to some degree even if the books they read are different.
3) You might want to consider readers who read books from the same book category to some degree as being similar. 2 readers who read books in the same category, even if different, would be considered different than readers who read books in the same category.

I hope this helps.

- Andrew Zwicker

Visit http://www.helpwithsql.com
Go to Top of Page

MichaelJSQL
Constraint Violating Yak Guru

252 Posts

Posted - 2011-05-22 : 10:20:32
This was fun -- I think this should work, but it's early and I haven't had coffee yet.

I used Andrew's tables as a test. Thanks Andrew

;With MyBooks
as
(
SELECT DISTINCT reader.zip,Reader.readerid, reader.name, book.bookid
FROM reader INNER JOIN
action ON reader.readerid = action.readerid INNER JOIN
book ON action.bookid = book.bookid
)
,MyResults
as
(
SELECT R1.zip,R1.name,Row_Number() Over(Partition by R1.zip order by COUNT(1) DESC) RwId,COUNT(1) cnt
FROM MyBooks R1 Inner join MyBooks R2
ON R1.BOOKID = R2.BOOKID and R1.readerid !=R2.readerid AND R1.zip= R2.zip
GROUP BY R1.zip,R1.name
)

Select zip,Name from MyResults Where RwId <=2

based on the data:

In zip code 53344 Mark and Sam are the most similar readers
In zip code 11111 Alex could have been similar to Bill or Sue as it is a tie. I am not sure how you want to break ties.

I did a test by adding another book for Sue
insert into action values(15, '1/11/2011', 'B', 4,4)

The tie was then broken and Sue became more similar to Alex

Go to Top of Page

Sny
Starting Member

4 Posts

Posted - 2011-05-22 : 13:41:18
Thank you so much for solution. I will try it ,i think it is a good solution..
Go to Top of Page
   

- Advertisement -