Author |
Topic |
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-24 : 14:54:17
|
Hi,I am writing a query that is suppose to produce the sum of all transactions for each customer and also output the subsidiary that the customer spent the most money at.The persons card number must begin with 2525250599 and these customers would have had to make a transaction in the month of Feb.This is what I came up with so far. It works but I find that it isn't producing the store that the customer spent the most money at in every case and also i am suppose to get a result set of about 34,949 but keep getting a total value of about 34,965 but I am thinking that a person may appear multiple times if they had the same maximum spend in 2 or more different stores.This would output 2 or more entries of this customers total spend and then it would show the different subsidiaries.Select distinct c.cardnumber ,sum(DailyTransactionValue)TotalSpend ,SubsidiaryWithMostSpendfrom customer c inner join SubsidiaryByCustomerByClassTransactionDetail ss on c.cardnumber=ss.cardnumber inner join ( SELECT distinct CardNumber ,Subsidiaryid as SubsidiaryWithMostSpend FROM SubsidiaryByCustomerByClassTransactionDetail t WHERE DailyTransactionValue = ( SELECT MAX(DailyTransactionValue) FROM SubsidiaryByCustomerByClassTransactionDetail WHERE CardNumber = t.CardNumber and TransactionDate >= { ts '2008-02-01 00:00:00.000' } And TransactionDate < { ts '2008-03-01 00:00:00'} ) and cardnumber like '%2525250599%' group by cardnumber,Subsidiaryid )data on data.cardnumber=c.cardnumberwhere TransactionDate >= { ts '2008-02-01 00:00:00.000' } And TransactionDate < { ts '2008-03-01 00:00:00'} and c.cardnumber like '%2525250599%' group by c.cardnumber ,SubsidiaryWithMostSpend Any input is greatly appreciated.Thanks alot guys |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-24 : 15:10:18
|
You can take this working query and use it to find any duplicates you have on Cardnumber by adding this to the topselect sl1.cardnumber, count(*)from (and this to the bottom) as sl1group by sl1.cardmemberhaving count(*) > 1That should help you track down your descrepancy and based on your code, I'd say you're right a person could theoretically spend the same amount in two locations and come up twice in this query. |
 |
|
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-24 : 15:21:24
|
Thanks alot for your input but I also want to know if this query looks correct for what I said I was trying to do. |
 |
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2008-04-24 : 15:31:31
|
If you want the persons card number must begin with 2525250599, the change and c.cardnumber like '%2525250599%'to and c.cardnumber like '2525250599%'Also,Is the DailyTransactionValue in the SubsidiaryByCustomerByClassTransactionDetail table an aggregate by day? Or do you actually need to sum that value by day to determine which subsidiary the customer spent the most money at? |
 |
|
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-24 : 15:41:59
|
Thanks I didn't even notice that.To you guys do you think that this middle section is pulling the subsidiaries that customers spend the most money at?ELECT distinct CardNumber ,Subsidiaryid as SubsidiaryWithMostSpend FROM SubsidiaryByCustomerByClassTransactionDetail t WHERE DailyTransactionValue = ( SELECT MAX(DailyTransactionValue) FROM SubsidiaryByCustomerByClassTransactionDetail WHERE CardNumber = t.CardNumber and TransactionDate >= { ts '2008-02-01 00:00:00.000' } And TransactionDate < { ts '2008-03-01 00:00:00'} ) and cardnumber like '2525250599%' group by cardnumber,Subsidiaryid |
 |
|
cat_jesus
Aged Yak Warrior
547 Posts |
Posted - 2008-04-24 : 15:52:56
|
It looks like it should. It's not exactly how I would write it, but you do have the possibility that for any given cardnumber you could have two or more records returned if their transactionvalue is the same and max. What do you do in that case? Just pick one or use both? |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-24 : 16:10:33
|
Why are you invoking ODBC calls?SELECT t.CardNumber, t.SubsidiaryID AS SubsidiaryWithMostSpendFROM SubsidiaryByCustomerByClassTransactionDetail AS tWHERE t.DailyTransactionValue = ( SELECT MAX(d.DailyTransactionValue) FROM SubsidiaryByCustomerByClassTransactionDetail AS d WHERE d.CardNumber = t.CardNumber AND d.TransactionDate >= '20080201 AND d.TransactionDate < '20080301' ) AND t.CardNumber LIKE '2525250599%'GROUP BY t.CardNumber, t.SubsidiaryID E 12°55'05.25"N 56°04'39.16" |
 |
|
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-25 : 08:51:39
|
Hi Again,I made all the changes that everyone mention and i must say the running time has really improved. I have realized that when a person shops twice at the same/different store and spend the same amount, this person total spend is displayed along with all the subsidiaries that the customer shopped at. When it is suppose to show the Customers total spend and the one subsidiary that they spent the most money at. In the below example the result should be cardnumber spend store2525250599599490 130.00 5706Instead I am getting the below:- [CODE]2525250599599490 130.00 57132525250599599490 130.00 57062525250599599490 130.00 5709[/CODE]This is what the customer transaction details looks like [see below]cardnumber spend Date store2525250599599400 30.00 2008-02-05 00:00:00.000 57062525250599599400 20.00 2008-02-07 00:00:00.000 57062525250599599400 30.00 2008-02-08 00:00:00.000 57062525250599599400 25.00 2008-02-13 00:00:00.000 57062525250599599400 25.00 2008-02-02 00:00:00.000 5706 The above shows the persons transactions for FEB./***************************************************************************/ In another situation you may find a person may have the below transactions[code]cardnumber spend Date store2525250599599400 30.00 2008-02-05 00:00:00.000 57062525250599599400 20.00 2008-02-07 00:00:00.000 57062525250599599400 30.00 2008-02-08 00:00:00.000 57022525250599599400 25.00 2008-02-13 00:00:00.000 57062525250599599400 25.00 2008-02-02 00:00:00.000 5706 In which case this should be the result[code]cardnumber spend store2525250599599490 130.00 57062525250599599400 30.00 5702 but instead I am getting the below results[CODE]2525250599599490 130.00 57132525250599599490 130.00 57062525250599599490 130.00 57092525250599599490 130.00 5702[/CODE]NOTE:PLEASE NOT THAT ANY IDS THAT ARE NOT DISPLAYED IN THE TRANSACTIONS TABLE SHOWN HERE ARE STORES THAT CUSTOMERS SHOPPED BEFORE FEB,08.NOTE AGAIN: I AM USING ONLY DATA FRO FEB |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-25 : 09:21:24
|
Maybe you should look at this with a different angle?This approach gives you a flexible solution, both with cardnumbers and dates.-- Prepare sample data. Only for mimic original poster's environment.DECLARE @Sample TABLE (CardNumber CHAR(16), StoreID INT, Time DATETIME, Amount MONEY)INSERT @SampleSELECT '2525250599112233', 1, '20080204', 200 UNION ALLSELECT '2525250599112233', 3, '20080207', 20 UNION ALLSELECT '2525250599024680', 1, '20080217', 100 UNION ALLSELECT '2525250599123456', 2, '20080211', 300DECLARE @Stores TABLE (StoreID INT, StoreName VARCHAR(40))INSERT @StoresSELECT 1, 'New York' UNION ALLSELECT 2, 'Bjuv' UNION ALLSELECT 3, 'London'-- Stage the intermediate result. The solution starts here.SELECT IDENTITY(INT, 1, 1) AS RecID, CardNumber, StoreID, SUM(Amount) AS Amount, CAST(NULL AS INT) AS SeqINTO #StageFROM @SampleWHERE CardNumber LIKE '2525250599%' AND Time >= '20080201' AND Time < '20080301'GROUP BY CardNumber, StoreIDORDER BY CardNumber, -- Per CardNumber SUM(Amount) DESC, -- Per amount descending COUNT(*) DESC -- If there is a tie with spent amount, choose the store with most transactions-- Find out which store is most spent by creating a sequenceUPDATE sSET s.Seq = s.RecID - e.minIDFROM #Stage AS sINNER JOIN ( SELECT CardNumber, MIN(RecID) AS minID FROM #Stage GROUP BY CardNumber ) AS e ON e.CardNumber = s.CardNumber-- Show the final resultSELECT s.CardNumber, MAX(w.StoreName) AS StoreMostSpent, SUM(s.Amount) AS TotalSpentFROM #Stage AS sLEFT JOIN @Stores AS w ON w.StoreID = s.StoreID AND s.Seq = 0GROUP BY s.CardNumberORDER BY s.CardNumber--Clean upDROP TABLE #Stage E 12°55'05.25"N 56°04'39.16" |
 |
|
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-25 : 10:11:56
|
Hi Peso,I just tried the suggestion you made and well its still running but I made a change to the final result section SELECT s.CardNumber, MAX(w.SubsidiaryId) AS StoreMostSpent, SUM(s.Amount) AS TotalSpentFROM #Stage AS s inner JOIN subsidiaryByCustomerByClassTransactionDetail AS w ON w.SubsidiaryId = s.SubsidiaryId and s.Seq = 0 GROUP BY s.CardNumberORDER BY s.CardNumber because the table called subsidiaryByCustomerByClassTransactionDetail contains all customers over 200,000 ppl and I working on a relatively small set of data 34,000 so i made it an inner join so that i would capture only those from the #stage table who made transactions and not any additional people.I will post again with the results from the query when its finished running.Thanks alot |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-25 : 10:14:15
|
You are standing on wrong foot!The table "subsidiaryByCustomerByClassTransactionDetail" should be joined while INSERTing to the #stage table, not afterwards. E 12°55'05.25"N 56°04'39.16" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-25 : 10:18:53
|
Like this-- Stage the intermediate resultSELECT IDENTITY(INT, 1, 1) AS RecID, CardNumber, SubsidiaryID, SUM(DailyTransactionValue) AS Amount, CAST(NULL AS INT) AS SeqINTO #StageFROM SubsidiaryByCustomerByClassTransactionDetailWHERE CardNumber LIKE '2525250599%' AND TransactionDate >= '20080201' AND TransactionDate < '20080301'GROUP BY CardNumber, SubsidiaryIDORDER BY CardNumber, SUM(DailyTransactionValue) DESC, COUNT(*) DESC-- Find out which store is most spent by creating a sequenceUPDATE sSET s.Seq = s.RecID - e.minIDFROM #Stage AS sINNER JOIN ( SELECT CardNumber, MIN(RecID) AS minID FROM #Stage GROUP BY CardNumber ) AS e ON e.CardNumber = s.CardNumber-- Show the final resultSELECT CardNumber, MAX(CASE WHEN Seq = 0 THEN SubsidiaryID ELSE '' END) AS SubsidiaryWithMostSpend, SUM(Amount) AS TotalSpendFROM #Stage AS sGROUP BY CardNumberORDER BY CardNumber-- Clean upDROP TABLE #Stage E 12°55'05.25"N 56°04'39.16" |
 |
|
velvettiger
Posting Yak Master
115 Posts |
Posted - 2008-04-25 : 10:47:51
|
Ok Peso,I get you. Actually I did joined "subsidiaryByCustomerByClassTransactionDetail" while INSERTing to the #stage table but I did it again at the bottom and when I think about it, that doesnt make much sense.One other thing Peso, I dont understand what is going on in this line in the third function.s.Seq = s.RecID - e.minID |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-04-25 : 10:57:45
|
All records have a sequential number, sorted by cardnumber ascending and SUM() descending.That makes same group of cardnumbers have a sequential range of recid with no gaps.I then fetch the minimum recid for each group and subtract from actual sequence (recid).Before (just after insert)919 A 100920 A 90921 A 80922 B 60923 B 59 Now I fetch minimum recid919 A 100 919920 A 90 919921 A 80 919922 B 60 922923 B 59 922 And now I subtract them919 A 100 919 0920 A 90 919 1921 A 80 919 2922 B 60 922 0923 B 59 922 1 So that for each and one cardnumber, the record with maximum amount spent gets sequence number zero.This is easy verifiable if you just do a select * from #stage order by recid E 12°55'05.25"N 56°04'39.16" |
 |
|
|