Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Spilt Up

Author  Topic 

satheesh
Posting Yak Master

152 Posts

Posted - 2011-06-29 : 10:30:32
Dear All

How to split the following comma-separated address into its' constituent parts?

declare @address nvarchar(1024);
set @address = 'Mrs Dorit McCann,34 Maryville Park,Belfast,County Antrim,BT9 6LP';

Any help would be highly apreciated

Regards,
SG

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-06-29 : 10:35:21
see here for example:
http://weblogs.sqlteam.com/dinakar/archive/2007/03/28/60150.aspx

The shown sp is also possible as function:
CREATE FUNCTION dbo.fnParseArray (@array VARCHAR(1000),@separator CHAR(1))
RETURNS @T Table (occ int,col1 varchar(50))
AS
BEGIN
--DECLARE @T Table (col1 varchar(50))
-- @Array is the array we wish to parse
-- @Separator is the separator charactor such as a comma
DECLARE @separator_position INT -- This is used to locate each separator character
DECLARE @array_value VARCHAR(1000) -- this holds each array value as it is returned
-- For my loop to work I need an extra separator at the end. I always look to the
-- left of the separator character for each array value
declare @occurence int =1 -- to get the Nth occurence
if (left(@array,1)=@separator)
begin
set @array=stuff(@array,1,1,'')
end
if (right(@array,1)<>@separator)
begin
SET @array = @array + @separator
end

-- Loop through the string searching for separtor characters
WHILE PATINDEX('%' + @separator + '%', @array) <> 0
BEGIN
-- patindex matches the a pattern against a string
SELECT @separator_position = PATINDEX('%' + @separator + '%',@array)
SELECT @array_value = LEFT(@array, @separator_position - 1)
-- This is where you process the values passed.
INSERT into @T VALUES (@occurence,@array_value)
set @occurence = @occurence + 1
-- Replace this select statement with your processing
-- @array_value holds the value of this element of the array
-- This replaces what we just processed with and empty string
SELECT @array = STUFF(@array, 1, @separator_position, '')
END
RETURN
END

-- example call
select * from
dbo.fnParseArray ('/23/34/45/67/89/101/','/')
where occ=1




No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-06-29 : 11:18:50
Here is a Fantastic article on splitting
http://www.sqlservercentral.com/articles/Tally+Table/72993/

Corey

I Has Returned!!
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-06-29 : 11:26:24
CREATE TABLE InputStrings
(keycol CHAR(10) NOT NULL PRIMARY KEY,
input_string VARCHAR(255) NOT NULL);

INSERT INTO InputStrings VALUES ('first', '12,34,567,896');
INSERT INTO InputStrings VALUES ('second', '312,534,997,896');
etc.

This will be the table that gets the outputs, in the form of the original key column and one parameter per row.

It makes life easier if the lists in the input strings start and end with a comma. You will need a table of sequential numbers -- a standard SQL programming trick, Now, the query,

CREATE VIEW ParmList (keycol, place, parm)
AS
SELECT keycol,
COUNT(S2.seq), -- reverse order
CAST (SUBSTRING (I1.input_string
FROM S1.seq
FOR MIN(S2.seq) - S1.seq -1)
AS INTEGER)
FROM InputStrings AS I1, Series AS S1, Series AS S2
WHERE SUBSTRING (',' + I1.input_string + ',', S1.seq, 1) = ','
AND SUBSTRING (',' + I1.input_string + ',', S2.seq, 1) = ','
AND S1.seq < S2.seq
GROUP BY I1.keycol, I1.input_string, S1.seq;

The S1 and S2 copies of Series are used to locate bracketing pairs of commas, and the entire set of substrings located between them is extracted and cast as integers in one non-procedural step. The trick is to be sure that the right hand comma of the bracketing pair is the closest one to the first comma. The relative position of each element in the list is given by the value of "place", but it does a count down so you can plan horizontal placement in columns.

This might be faster now:

WITH Commas(keycol, comma_seq, comma_place)
AS
(SELECT I1.keycol, S1.seq,
ROW_NUMBER() OVER (PARTITION BY I1.keycol ORDER BY S1.seq)
FROM InputStrings AS I1, Series AS S1
WHERE SUBSTRING (',' || I1.input_string || ','
FROM S1.seq
FOR 1) = ','
AND S1.seq <= CHARLENGTH (I1.input_string))

SELECT SUBSTRING(',' || I1.input_string || ','
FROM C1.comma_place +1
FOR C2.comma_place - C1.comma_place - 1)
FROM Commas AS C1, Commas AS C2
WHERE C2.comma_seq = C1.comma_seq + 1
AND C1.keycol = C2.keycol;

The idea is to get all the positions of the commas in the CTE and then use (n, n+1) pairs of positions to locate substrings. The hope is that the ROW_NUMBER() is faster than the GROUP BY in the first attempt. Since it is materialized before the body of the query (in theory), there are opportunities for parallelism indexing and other things to speed up the works.

Hey, I can write kludges with the best of them, but I don't. You need to at the very least write a routine to clean out blanks, handle double commas and special characters in the strings. Clean up address data with a special tool like Melissa Data instead of writing it yourself.

--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page
   

- Advertisement -