Converting Multiple Rows into a CSV String (Set Based Method)

By Jay White on 9 September 2002 | Tags: SELECT

Based on the problem outlined in RocketScientist's article (entitled Converting Multiple Rows into a CSV String), I would like to present an alternative method for accomplishing a similar task. This article will show a faster method to convert multiple rows into multiple CSV strings.
Say you had a table with a composite key, and you wanted to return each distinct first member and a CSV list of all the applicable second members. Using the approach outlined by RocketScientist, you would have to iterate through each distinct first member and run the sp_MakeCharList proc N times. I think there is a better way.
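To make the goal concrete, here is a small sketch of the input and desired output for such a composite-key table (the table and column names here are illustrative, not from RocketScientist's article):

```sql
-- illustrative input: composite key (OrderID, Product)
-- OrderID  Product
-- -------  -------
-- 1        apples
-- 1        pears
-- 2        plums

-- desired output: one row per distinct OrderID, with a CSV of its products
-- OrderID  ProductList
-- -------  -------------
-- 1        apples, pears
-- 2        plums
```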
First, the DDL and sample data:

```sql
create table Page47 (
    i int not null,
    vc varchar(5) not null,
    constraint pk_Page47 primary key (i, vc)
)
go

set nocount on
declare @i int
select @i = 0
while @i < 5000
begin
    insert into Page47 (i, vc)
    select round((rand() * 100), 0),
           char(round((rand() * 25), 0) + 97) +
           char(round((rand() * 25), 0) + 97) +
           char(round((rand() * 25), 0) + 97) +
           char(round((rand() * 25), 0) + 97) +
           char(round((rand() * 25), 0) + 97)
    select @i = @i + 1
end
go
```

The sample data should look something like this:

```sql
select top 10 i, vc
from Page47
order by newid()
```

```
i           vc
----------- -----
743         nndsn
267         krtub
582         lgrrg
185         qmcku
907         rrpqe
635         cpbcp
237         uovqh
85          bpixy
554         mfsqh
49          geabv

(10 row(s) affected)
```

First, let's look at adapting RocketScientist's solution to this problem. Here is how you might iterate through the table to build all of the CSVs:

```sql
--create a temp table
select identity(int,1,1) as rowid, i
into #workingtable
from Page47
group by i

--create a table to hold the lists
create table #Rocket (
    i int not null primary key,
    list varchar(8000) not null)

declare @i int, @maxrowid int, @sql varchar(8000), @list varchar(8000)

--identity(int,1,1) numbers the rows starting at 1
select @maxrowid = max(rowid), @i = 1 from #workingtable

--loop through all the i's
while @i <= @maxrowid
begin
    select @sql = 'select vc from Page47 where i = ' + convert(varchar, i)
    from #workingtable
    where rowid = @i

    exec sp_MakeCharList @codelistselect = @sql, @delimitedlist = @list output

    insert into #Rocket (i, list)
    select i, @list
    from #workingtable
    where rowid = @i and @list is not null

    select @i = @i + 1
end

--return a sample from the final rowset
select top 10 i,
       case when len(list) > 50
            then convert(varchar(50), left(list, 47) + '...')
            else convert(varchar(50), list)
       end as list
from #Rocket
order by newid()
go
```

```
i           list
----------- --------------------------------------------------
790         fpflf, hjutq, inahr, kgkox, kroox, nbkiy, rsscw...
752         imokv, jcrgz, khsui, lbmwm, pkjee, ujxoi, ulgsm...
74          aphdr, cdmoc, fogdm, gslsg, hushk, icchc, lybud...
320         bvevh, ohyqv, oivyc, yevjb, ypnrk, yxfji
956         aisgk, emeup, fhtad, hffnm, iidpn, ivqlk, jhklu...
530         bqcjr, dnfzr, dqwas, egqpm, fplip, ibrxp, jcbhk...
137         aorpw, bngki, cibny, hwsrp, jdown, qlepc, zekbo, z
72          ghrxi, gmkzk, mufmc, uixpr, vjvbu, ymruj, zmucr, z
911         fibrs, fuhub, hgcwh, kuvcf, lcfux, nrqht, pfaae...
62          edxyj, grrmt, hmkfv, hprlt, ieqvg, mojsk, pdrmv...
```

Now, here is a set based method for doing the same thing.

```sql
--create a table to work with
create table #workingtable (
    i int not null,
    vc varchar(5) not null,
    list varchar(8000),
    constraint pk_wt primary key (i, vc)
)

insert into #workingtable (i, vc)
select i, vc
from Page47
order by i, vc

declare @list varchar(8000), @lasti int
select @list = '', @lasti = -1

--here is the meat of the work: the variable @list carries its value
--from row to row, restarting whenever i changes
update #workingtable
set @list = list = case when @lasti <> i
                        then vc
                        else @list + ', ' + vc
                   end,
    @lasti = i

--return a sample from the final rowset
select top 10 i,
       case when len(max(list)) > 50
            then convert(varchar(50), left(max(list), 47) + '...')
            else convert(varchar(50), max(list))
       end as list
from #workingtable
group by i
order by newid()
go
```

```
i           list
----------- --------------------------------------------------
127         itvgq, ljosw, nxmdj, oshrp, plxff, pubig, sthck...
849         gcifo, hbxkf, njkdl, sfesm, sjhky, uxhfq, vjeno...
684         fejly, fqyqf, gpfce, hutht, kwywo, mapco, momqn...
461         fsofv, fzked, murat, vzmek, yrqjo
612         nmmey, tfjhv, ulwuj, xxaaq
374         bbthd, jvjwz, klcsq, mrakf, peztf, pixww, rtwdd
730         dlynf, egqei, hhckx, nsvdn, obnhh, rfbwh, ytgfi
458         eijdr, gtxhu, lhtqh, phprf, qjhcr, vqnos
655         bijer, fwlgk, nrcbm, sohho, trjtw, usjdj, uvpie...
837         ayxcv, epurf, flvtj, ftxcj, imjap, pmygd, sqhcc...
```

My tests show the iterative method taking 2.813 seconds and the set-based method taking 0.670 seconds with this set of sample data.
Increasing the number of rows shows the cursor solution creeping up on the set-based solution; however, increasing the number of distinct i's shows the performance of the cursor solution sharply declining. With 1,000 distinct i's, the cursor solution dropped to 89.420 seconds while the set-based solution finished in 1.513 seconds. As you can see, the right implementation choice will depend on your data set, so you should test both methods to find which is right for you.

One last thing: why does that update statement work in the set-based solution? Honestly, your guess is as good as mine. You can read more about that here.
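As a minimal illustration of the mechanism (this example is mine, not from the original article): T-SQL allows `SET @variable = column = expression` inside an UPDATE, assigning the expression to both the variable and the column for each row the statement touches. Because the variable persists across rows, it can carry running state, as in this hypothetical running-total sketch:

```sql
--hypothetical table for demonstrating @var = column = expression
create table #RunningTotal (
    id int not null primary key clustered,
    amount int not null,
    running int null
)

insert into #RunningTotal (id, amount) values (1, 10)
insert into #RunningTotal (id, amount) values (2, 20)
insert into #RunningTotal (id, amount) values (3, 5)

declare @total int
select @total = 0

--for each row, @total accumulates and is written into the running
--column; this depends on SQL Server visiting the rows in
--clustered-index order, which the product does not guarantee
update #RunningTotal
set @total = running = @total + amount

select id, amount, running from #RunningTotal
--if the rows are visited in id order, running should be 10, 30, 35

drop table #RunningTotal
```

The same caveat applies to the set-based CSV solution above: it relies on the update visiting `#workingtable` in the order of its primary key, which is observed behavior rather than documented behavior.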