Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 Parse a text field into component parts

Author  Topic 

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-23 : 18:12:36
How do I convert dates that are buried in a text string?

the field is a log field. The application just does an append to the field, by adding a delimiter, a integer representing the date, a delimiter, then the log comment. There are many of these log comments strung together int this text field. The application has code to do the date conversions in the app GUI...how could I do that conversion in a query?...SSIS?...

the field looks like this...

1184070983tom.littleThe server name DRSJOLIO1 TAG # V042271184072010billy.fosterThe server name ORSJOLIO1 LAPTOP TAG # V04227 1184163331billy.fosterUser reports she still cannot synch her laptop.1184163950billy.fosterTalked to Jeremy to see if his people are aware of the ticket out there. Jeremy stated they were able to log in and that he would remind them.1184170909billy.fosterLeft Jeremy a voicemail requesting he get the ticket assigned to someone.


to convert these integers I use: dateadd(s,Create_Date,'19700101')

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-09-23 : 18:27:21
Can you post the code that the app uses to do the conversion?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-24 : 09:33:01
sorry...its a COTS product, and the vendor considers it a trade secret...

grrrrrrrrrrrr.....
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-24 : 09:56:55
Assuming your example was dated July 10, 2007:

select dateadd(s, 1184070983,'1/1/1970')

Some trade secret, it's the number of seconds from Unix epoch (1/1/1970). Tell your vendor nice try, they should use the internet someday.

If that's the correct date let me know, I think I have a solution.

By the way, the example data you posted has embedded line feeds in there that mess up copy paste. Another indication this is from a Unix system, correct?
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-24 : 10:53:54
rob..mm..yes..that is the correct date...but, I need more than just the date converter...(not sure about the host os..)...

I figured out the epoc date...for a date field (which is really an integer field) I use: dateadd(s,Create_Date,'19700101')
to do a conversion when the number of seconds, like the cheese, stands alone in a field...

my problem is this chitty log text field..when the epoc date is embedded in a text stream (surrounded by those special characters you mentioned)... I am guessing the vendor's client side app has the code to parse, convert, then reassemble the log field so that the epoc dates are readable....

sooo..how do I parse... (I have the convert part)...then reconstruct the log field so that the dates are now..well..dates... instead of the number of seconds from 1/1/1970...????.. enough to drive a guy to drink!
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-24 : 11:07:26
Sorry, I missed your update earlier. I also made a mistake, they used char(3) and char(4) as delimiters (sneaky bastards), instead of line feeds. (See here for an ASCII table: http://www.asciitable.com/)

The following code parses just the date, if you need the other columns too let me know:
with n(n) as (select 1 union all select n+1 from n where n<50),  -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate numbers
c3(c3) as (select char(3)), -- shorthand for char(3)
c4(c4) as (select char(4)), -- shorthand for char(4)
d(d) as (select c3 + data + c3 from #doi cross join c3), -- prepare data for parsing, change #doi and data to match your table and column
z(s,n,c) as (select substring(d,a,charindex(c3,d,a+1)-a), a, charindex(c3,d,a+1) -- slice each log record on char(3) boundary
from a cross join d cross join c3 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,charindex(c4,s)-1) as int), '1/1/1970') as LogDate -- convert to date
from z cross join c3 cross join c4
Sorry for the mess. I used multiple CTEs because frankly, the non-CTE version was even uglier. Substring() and Charindex() are too long for function names.

Looking at it now I think I can clean it up, but I need some time away from it.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-24 : 11:19:16
That didn't last long, here's a slightly shorter version:
;with 
c3(c3) as (select char(3)), -- shorthand for char(3)
n(n) as (select 1 union all select n+1 from n where n<50), -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate more numbers
-- prepare data for parsing, change #doi and data to match your table and column names
d(d) as (select c3 + data + c3 from #doi cross join c3),
-- slice each log record on char(3) boundary
z(s,n,x) as (select substring(d,a,charindex(c3,d,a+1)-a), charindex(char(4),d,a+1), a
from a cross join d cross join c3 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as LogDate -- convert to date
from z
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-24 : 11:20:13
yikes... your taking me to a whole new sql coding level!... but I get the drift...

I believe the parser should produce:LogDate; UserName;Comment ... then the delimiters start the sequence again..and again...depending on how many comments are in the log field....

1184070983tom.littleThe server name DRSJOLIO1 TAG # V042271184072010
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-24 : 11:37:06
ahh...#doi = RemedyChangeDatamart (my table), and data = CHG_Work_Log (the work log text field)... give me an error (note- I am in sql 2008):
Msg 402, Level 16, State 1, Line 1
The data types char and text are incompatible in the add operator.

;with
c3(c3) as (select char(3)), -- shorthand for char(3)
n(n) as (select 1 union all select n+1 from n where n<50), -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate more numbers
-- prepare data for parsing, change #doi and data to match your table and column names
d(d) as (select c3 + CHG_Work_Log + c3 from RemedyChangeDataMart cross join c3),
-- slice each log record on char(3) boundary
z(s,n,x) as (select substring(d,a,charindex(c3,d,a+1)-a), charindex(char(4),d,a+1), a
from a cross join d cross join c3 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as LogDate -- convert to date
from z
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-24 : 11:51:08
Try this:
;with 
c3(c3) as (select char(3)), -- shorthand for char(3)
c4(c4) as (select char(4)), -- shorthand for char(4)
n(n) as (select 1 union all select n+1 from n where n<50), -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate more numbers
-- prepare data for parsing, change #doi and data to match your table and column names
d(d) as (select c3 + CHG_Work_Log + c3 from #RemedyChangeDatamart cross join c3),
-- slice each log record on char(3) boundary
z(s,n,x) as (select substring(d,a,charindex(c3,d,a+1)-a), charindex(c4,d,a+1), a
from a cross join d cross join c3 cross join c4 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as LogDate, -- convert to date
substring(s,charindex(c4,s)+1,charindex(c4,s,charindex(c4,s)+1)-charindex(c4,s)-1) UserName, -- parse UserName
substring(s,charindex(c4,s,charindex(c4,s)+1)+1,1024) Comment --parse Comment
from z cross join c4
I changed the column and table names to match your definition, just remove the "#". If you need to add more columns let me know their names and I'll tweak the code.
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-24 : 12:14:18
same error...?..I did remove the # before my table name...

Msg 402, Level 16, State 1, Line 1
The data types char and text are incompatible in the add operator.

;with
c3(c3) as (select char(3)), -- shorthand for char(3)
c4(c4) as (select char(4)), -- shorthand for char(4)
n(n) as (select 1 union all select n+1 from n where n<50), -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate more numbers
-- prepare data for parsing, change #doi and data to match your table and column names
d(d) as (select c3 + CHG_Work_Log + c3 from RemedyChangeDatamart cross join c3),
-- slice each log record on char(3) boundary
z(s,n,x) as (select substring(d,a,charindex(c3,d,a+1)-a), charindex(c4,d,a+1), a
from a cross join d cross join c3 cross join c4 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as LogDate, -- convert to date
substring(s,charindex(c4,s)+1,charindex(c4,s,charindex(c4,s)+1)-charindex(c4,s)-1) UserName, -- parse UserName
substring(s,charindex(c4,s,charindex(c4,s)+1)+1,1024) Comment --parse Comment
from z cross join c4
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-24 : 13:23:21
OK, this should fix it:
;with 
c3(c3) as (select char(3)), -- shorthand for char(3)
c4(c4) as (select char(4)), -- shorthand for char(4)
n(n) as (select 1 union all select n+1 from n where n<50), -- generate numbers
a(a) as (select row_number() over (order by z.n) from n z cross join n y), -- generate more numbers
-- prepare data for parsing, change #doi and data to match your table and column names
d(d) as (select c3 + cast(CHG_Work_Log as varchar(max)) + c3 from #RemedyChangeDatamart cross join c3),
-- slice each log record on char(3) boundary
z(s,n,x) as (select substring(d,a,charindex(c3,d,a+1)-a), charindex(c4,d,a+1), a
from a cross join d cross join c3 cross join c4 where a<len(d) and substring(d,a-1,1)=c3)
select dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as LogDate, -- convert to date
substring(s,charindex(c4,s)+1,charindex(c4,s,charindex(c4,s)+1)-charindex(c4,s)-1) UserName, -- parse UserName
substring(s,charindex(c4,s,charindex(c4,s)+1)+1,1024) Comment --parse Comment
from z cross join c4
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-27 : 10:40:48
^5.... works like a champ!...

I'll dig into the code and try to understand what you did...then I'll know how to fish!
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-27 : 10:43:47
The parsing technique is explained here:

http://www.sqlteam.com/article/parsing-csv-values-into-multiple-rows
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-27 : 10:58:51
ahhh.... got it...thanks for the reference to the explaination...

the current solution, parses the work log... next I need to un-parse the work log segments BACK into a single work log field (with the dates converted) that will go with the original record...

I am thinking I can just concatenate the fields, but not sure how to break the loop per record..?..at row-number?:

substring(s,charindex(c4,s,charindex(c4,s)+1)+1,1024) Comment --parse Comment
,WorkLog = cast(dateadd(second, cast(left(s,n-x) as int), '1/1/1970') as varchar(max)) + ' - ' +substring(s,charindex(c4,s)+1,charindex(c4,s,charindex(c4,s)+1)-charindex(c4,s)-1) + ' - ' +substring(s,charindex(c4,s,charindex(c4,s)+1)+1,1024)
from z cross join c4
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-27 : 11:28:44
Do you mean you want to actually update that column to change the Unix dates to normal? That's going to be very hard, especially since it's a text column, and you probably shouldn't anyway, it will break their (probably already broken in some way) software.

Did you instead want a view that parses the dates? Do you really need them to be concatenated back into a single stream?
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-27 : 12:25:30
Im pulling the data from the operational database, and pushing it into a 'datamart' that will be used by analysts for slice and dice...so, no...it will not break the application...and yes, I really just want those epoc integer fields converted to readable dates...in the same work_log field of each record...

sooo...somewhere in your loop you break each delimited work log comment into a record..I really just need the silly epoc date converted and stuck back into the same field (or a new Converted_Work_Log field)...

make sense?
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2010-09-27 : 12:46:35
I see what you mean, but it strikes me as counterproductive if you're going to maintain the char(3)/(4) characters in there, especially for a data warehouse. Either store the fully parsed data only, or just leave it as-is and parse/convert on demand, or make a view out of the code.
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-09-27 : 18:03:59
to answer your question directly. convert and remove. I dont see a big difference in leaving them or not... my datamart is not that large (200K records)...and I consider this a datamart, not warehouse, (I view the difference as a mart is for a specific analytic task..warehouse more inclusive and general)... so I can play with the mart in ways I would not in a warehouse...

again, its an analytic datamart..not the operational database w/appliation...so I have full control over what is stored, how it is stored..

now, that said, the purpose is to put the data in a form that a business analyst (independent of the DBA) can slice and dice... not relational fields (all are converted to readable data)...

similarly, you can throw out the delimiters, or leave them in...the analyst will display the log field as a single text stream.... a business analyst will be able to read the log (date, submitter, comment...date, submitter, comment)...

in a reporting tool (like microsoft report writer, or crystal) we might get a bit more sophisticated in the display of the log, but for my initial purposes getting that rotten integer converted into a date goes SOOOO much farther than I was...

sooo..if I can just get the date converted in the field, leave the special chars or remove them if easier...they are noops for what I need at this point..

many thanks for your most excellent work!
Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-11-19 : 15:50:57
never mind...
I added the ID field into the d(ID,d) as (select change_id....
and then the ID into the z(ID,,s,n,x) ...select ID, ...

that made the ID field available in the final select query select ID, Left( ....



follow-up question..how do I add a field..?..in the table remedychangedatamart I have a field named Change_ID...I would like the change_ID for the record to appear with each line of the broken apart log field..that way I can trace it back to the original record...

Go to Top of Page

dlorenc
Posting Yak Master

172 Posts

Posted - 2010-11-19 : 16:22:41
but in doing this..I did notice that it seems your code is only pulling out the first comment..in the worklog field...there could be 0 thru n comments in the log...??
Go to Top of Page
    Next Page

- Advertisement -