Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 The strange case of nulls and zero length strings

Author  Topic 

cjp
Yak Posting Veteran

69 Posts

Posted - 2011-03-30 : 07:35:22
I am using SQLS 2005, sp3

For some years, I have worked with unseparated strings as a data source for a medical database. This holds space literals where a value is missing (eg ........ if a date is missing when it would otherwise be in the form yyyymmdd). When importing, my script runs a simple isdate() procedure to dump the date if it is there and a null if it isn't. So far so good.

Recently, our data supplier sent me some data in separated text format (csv - though I have tested alternatives and found the same problems if I use tab-separation). When I tried to convert the strings to dates, I found that missing values were being held as zero-length strings ('') and, when I ran isdate() on the date field, it returned 01/01/1900 + the time parts for this value.

This prompted a small panic but I eventually found that I could use a Case procedure to convert '' to a null value when a date was not present. I also found that Convert(datetime, fieldname) works much more reliably than isdate(). So:

smokingdate = Case When demography.smokedate Like '' Then Null Else Convert(datetime, demography.smokedate, 103) End

worked fine (103 is there because the data had been provided with date strings in UK order).

When I output my table in tab-sep txt format and then re-imported it to check that everything was working, I found that nulls were being held as zero-length strings. - So I had to fiddle around again in order to get everything to play nicely.

So: 1 - Why does Sequel Server read a zls as 01/01/1900 and 2 - Is there any way of avoiding the presence of zero-length strings (stipulating nulls or, perhaps, space literals instead) when exporting to/importing from separated text format? (I have been using the export/import wizard to handle these operations)

Thanks.

Chris

Seventhnight
Master Smack Fu Yak Hacker

2878 Posts

Posted - 2011-03-30 : 08:51:46
1/1/1900 is a default ( a min date if you will)... I'm not sure I followed the rest of the exporting/reimporting issue, but maybe a little change to the 'conversion' will help

you don't really need a case statement... nullif() will do. and I find it good practice to ltrim(rtrim()) before I use nullif for this purpose...

smokingdate = Convert(datetime, nullif(ltrim(rtrim(demography.smokedate)),''), 103)




Corey

I Has Returned!!
Go to Top of Page

cjp
Yak Posting Veteran

69 Posts

Posted - 2011-03-31 : 04:03:13
Thanks for the suggestion - makes good sense.

Yesterday, I trawled the web on this topic and found a huge literature - I am clearly not alone in my uncertainties. The following link is to an article that summed up clearly my reasons for posting this topic:

http://decipherinfosys.wordpress.com/2009/03/27/null-vs-empty-zero-length-string/

Sorry if my questions were not very clear - it is difficult to put all this into words. On reflection, my question is this:

When importing from a separated txt/csv table into SQL Server, what ways exist to tell SQLS to treat the absence of a value in the source data as a null rather than as a zero-length string? I envisage a situation in which, for example, there are two consecutive tabs in a tab-sep source, and I would like SQLS to dump the outcome as a null rather than as a zls.

Someone told me that Oracle can be set up to treat zls's and nulls as the same thing (sounds good to me) - ie as null values - and I would certainly like to be able to change all zls's to nulls during import to SQLS (otherwise, the various fixes have to be applied post hoc, and this could be very time-consuming).

Thanks.

Chris
Go to Top of Page

jcelko
Esteemed SQL Purist

547 Posts

Posted - 2011-03-31 : 12:04:04
I don't like to clean up data in SQL. I would write a short C, BASIC, AWK or whatever scrubbing routine that goes thru the source file and does all that stuff. It is easy to do capitalization, map
"<comma><comma>" to "<comma>NULL<comma>", etc. in one pass.


--CELKO--
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Go to Top of Page
   

- Advertisement -