Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-20 : 12:55:14
|
| Hi, I have a table that stores the results as XML, example of a result<classifications totalrecords="62"><classification numrecords="4">Dairy Farmers</classification><classification numrecords="6">Take Away Meal Outlets</classification><classification numrecords="52">Public Houses Activities Of</classification></classifications>I need to split the results so that I haveTotals 62Dairy Farmers 4Take Away Meals 6Public Houses Activities Of 52Any ideas would be welcomed with open arms. Even if the results had the Total, Dairy Farmers etc. as headers that would be fine as well. There are about 5 million records each with 3 or less sets of classifications.Regards,Roland |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-02-20 : 13:26:30
|
see an example hereDECLARE @X xmlSET @X='<classifications totalrecords="62"><classification numrecords="4">Dairy Farmers</classification><classification numrecords="6">Take Away Meal Outlets</classification><classification numrecords="52">Public Houses Activities Of</classification></classifications>'[code]SELECT t.u.value('.','varchar(100)') as desc,t.u.value('./@numrecords','int') as count,t.u.value('../@totalrecords,'int') as totalcountFROM @X.nodes('/classifications/classification')t(u)------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-21 : 10:11:45
|
| I have tidied the results using this querySELECT REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( classifications ,'classifications','') ,'Classification','') ,'<','') ,'>','') ,'/','') ,'numrecords="',',') ,' totalrecords="','') ,'"',',') ,', ,',',Totals,') ,' ,',',') as fishFROM [icd_log].[dbo].[icdlog_2012_02]where classifications not like '<classifications totalrecords="0"></classifications>'RESULTS LOOK LIKE THIS:210,Totals,210,Take Away Meal Outlets1,Totals,1,Beer Wine And Spirit (Retail)1,Totals,1,Cafes And Snack Bars6,Totals,6,Fast Food Delivery210,Totals,210,Take Away Meal Outlets1,Totals,1,Florists1,Totals,1,Restaurant - Pizza4,Totals,2,Electrical Contractors And Electricians,2,Specialised Building Trade ContractorsWhat I would like to do now is split the results so that they look like this:Numbers Market Sections4 Totals2 Electrical Contractors And Electricians2 Specialised Building Trade ContractorsI have been told that I can do it with a loop query, not sure how though.Cheers,Roland |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-21 : 10:34:31
|
| You can use a split function - for example, the one listed in Fig 21 of this page: http://www.sqlservercentral.com/articles/Tally+Table/72993/However, the approach Visakh suggested using the XML methods would be a MUCH better approach, so I would recommend trying to make that work. |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-21 : 10:53:22
|
| Okay, have used the original XML method, is there a way to select the results from a query to use as SET. There are over 5 million sets of result. Below is a larger sample set.<classifications totalrecords="24"><classification numrecords="6">Doctors</classification><classification numrecords="6">Hairdressers (Unisex)</classification><classification numrecords="12">Dentists</classification></classifications><classifications totalrecords="28"><classification numrecords="8">Solicitors</classification><classification numrecords="10">Kitchen Planners And Installers</classification><classification numrecords="10">Jewellery Retail Sale Of</classification></classifications><classifications totalrecords="8"><classification numrecords="2">Engineers (Consulting)</classification><classification numrecords="2">Gardening Services</classification><classification numrecords="4">Estate Agents</classification></classifications><classifications totalrecords="8"><classification numrecords="2">Beauty Salons</classification><classification numrecords="2">Take Away Meal Outlets</classification><classification numrecords="4">Doctors</classification></classifications><classifications totalrecords="1"><classification numrecords="1">Schools (Local Authority)</classification></classifications><classifications totalrecords="2"><classification numrecords="2">Funeral Services</classification></classifications><classifications totalrecords="1"><classification numrecords="1">Banks And Financial Institutions</classification></classifications><classifications totalrecords="1"><classification numrecords="1">Doctors</classification></classifications> |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-21 : 11:35:01
|
Since your data is in a table, modify Visakh's query slightly as shwon below. Try it on a small sample table and see if you get the results you want.SELECT t.u.value('.','varchar(100)') as [desc],t.u.value('./@numrecords','int') as count,t.u.value('../@totalrecords','int') as totalcountFROM YourTable YTcross apply YT.YourXMLCol.nodes('/classifications/classification')t(u)If you don't get the results you are looking for, can you post the exact output you are looking for for the sample data you posted in your latest post?With 5 million records, this may be slow. If you need to do this query frequently, you may want to consider adding an XML primary index (at least). However, the space requirements for XML indexes are quite large, so that may be a consideration. |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-21 : 12:07:34
|
| This is the query I have so farSELECT t.u.value('.','varchar(100)') as [Market section],t.u.value('./@numrecords','int') as Counts,t.u.value('../@totalrecords','int') as [Total Count]FROM [icd_log].[dbo].[icdlog_2012_02] YTcross apply YT.classifications.nodes('/classifications/classification')t(u)but it gives me an error messageMsg 9506, Level 16, State 1, Line 2The XMLDT method 'nodes' can only be invoked on columns of type xml.The query works fine when I set a variable e.g. <classifications totalrecords="24"><classification numrecords="6">Doctors</classification><classification numrecords="6">Hairdressers (Unisex)</classification><classification numrecords="12">Dentists</classification></classifications>And declare the variable as xml. Any ideas from here? |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-21 : 12:35:33
|
| I have changed the query slightly SELECT t.u.value('.','varchar(100)') as [Market section],t.u.value('./@numrecords','int') as Counts,t.u.value('../@totalrecords','int') as [Total Count]FROM (SELECT CAST(classifications as xml) AS classifications FROM [icd_log].[dbo].[icdlog_2012_02]) YTCROSS APPLY YT.classifications.nodes('/classifications/classification')t(u)but am now getting the error message:Msg 9421, Level 16, State 1, Line 1XML parsing: line 1, character 77, illegal name characterRegards,Roland |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-21 : 12:50:27
|
This is most likely because the data type of the column is not XML. It probably is varchar or nvarchar. You can cast it to XML and run the same query, but there are couple of concerns:a) Casting to XML may fail because of data issues - if the string data is not well-formed XML this will happen.b) Additional overhead of converting to XML and then shredding it. Given that, I am not sure if you will gain any performance advantage over your initial thought of treating it as a string and doing string replacements. Regardless, worht a try - cast it to XML like this:....FROM (SELECT CAST(YourStringColumn AS XML) AS xmlCol FROM YourTable) YTcross apply xmlcol.nodes('/classifications/classification')t(u) |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-22 : 04:32:24
|
| When I try to CAST the column to XML, I get the error message:Msg 9421, Level 16, State 1, Line 1XML parsing: line 1, character 77, illegal name characterI could change the data in the column so that it is comma delimited by using a replace. e.g.<classifications totalrecords="62"><classification numrecords="4">Dairy Farmers</classification><classification numrecords="6">Take Away Meal Outlets</classification><classification numrecords="52">Public Houses Activities Of</classification></classifications>52,Totals,4,Dairy Farmers,6,Take Away Meal Outlets,52,Public Houses Activities OfCan I then split this? What do you thinkRegards,Roland |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-22 : 05:34:54
|
quote: Originally posted by rcp When I try to CAST the column to XML, I get the error message:Msg 9421, Level 16, State 1, Line 1XML parsing: line 1, character 77, illegal name characterI could change the data in the column so that it is comma delimited by using a replace. e.g.<classifications totalrecords="62"><classification numrecords="4">Dairy Farmers</classification><classification numrecords="6">Take Away Meal Outlets</classification><classification numrecords="52">Public Houses Activities Of</classification></classifications>52,Totals,4,Dairy Farmers,6,Take Away Meal Outlets,52,Public Houses Activities OfCan I then split this? What do you thinkRegards,Roland
That is what I was afraid of. That some of the data may not be weii-formed XML.To split the comma separated string, you can use the function in the link in my reply on 2/21/2012. Or, search this forum, younwill find many examples. |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-22 : 06:04:00
|
| I have been informed that running a loop query would give me what I want. In the table there is a unique icdlog_guid. I could then use the original suggestion from visakh16 and loop through the icdlog_guid. What do you think? |
 |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-22 : 07:52:52
|
quote: Originally posted by rcp I have been informed that running a loop query would give me what I want. In the table there is a unique icdlog_guid. I could then use the original suggestion from visakh16 and loop through the icdlog_guid. What do you think?
That may not be the best solution - for two reasons:a) More likely than not, looping would result in very poor performance.b) You would still run into the problem of XML that is not well-formed.What I was suggesting was that, since you already have a way of getting the data that you need as a comma-separated string even when it is not well-formed XML, to take that comma-separated string and splitting it into rows. |
 |
|
|
rcp
Starting Member
32 Posts |
Posted - 2012-02-22 : 08:14:22
|
| Okay, if I change it into comma-separated string, can I split it so that it is in columns per row?e.g.results210,Totals,210,Take Away Meal Outlets,10,Curry1,Totals,1,Beer Wine And Spirit (Retail),10,Totals,1,Cafes And Snack Bars,4,Chocolaterequired resultsA B C D E F210 Totals 210 Take Away Meal Outlets 10 Curry1 Totals 1 Beer Wine And Spirit (Retail) NULL NULL10 Totals 1 Cafes And Snack Bars 4 Chocolate |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
|
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-02-22 : 09:43:38
|
Visakh, I can't see your blog - my place of work very wisely blocks dangerous blogs ;) so I might be repeating what you have in the blog. But, rcp, here is some example code. The example uses the parsing function from Fig. 21 in Jeff Moden's article http://www.sqlservercentral.com/articles/Tally+Table/72993/ So if you want to run this test, you will need to install that function.The temp table that I have is for generating test data. ---- TEST DATA ---------------------------------------CREATE TABLE #tmp (id int, col1 VARCHAR(8000));INSERT INTO #tmp VALUES (1, '210,Totals,210,Take Away Meal Outlets,10,Curry')INSERT INTO #tmp VALUES (2, '1,Totals,1,Beer Wine And Spirit (Retail),')INSERT INTO #tmp VALUES (3, '10,Totals,1,Cafes And Snack Bars,4,Chocolate')-----USING Splitter function ---------------------------SELECT *FROM #tmp CROSS APPLY (SELECT * FROM MASTER.dbo.DelimitedSplit8K(col1,',') ) s------PIVOTING --------------------------------------;WITH cte AS(SELECT Id,ItemNumber,ItemFROM #tmp CROSS APPLY (SELECT * FROM MASTER.dbo.DelimitedSplit8K(col1,',') ) s)SELECT *FROM ctePIVOT(MAX(Item) FOR ItemNumber IN ([1],[2],[3],[4],[5],[6]))P---- cleanupDROP TABLE #tmp; |
 |
|
|
|
|
|
|
|