Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SSIS and Import/Export (2008)
 strange output from bcp

Author  Topic 

cjp
Yak Posting Veteran

69 Posts

Posted - 2011-09-07 : 05:33:31
I am using SQLS 2008 x64 SP 2 running on W 7 Pro x64 SP 1.

I have been outputting manipulated data from a large medical database to txt files for statistical analysis. I have used bcp to export the data direct to the usb disk where I am holding the backups. The dataset is for a study on a serious arterial disease called telangiectasia. Here is the query (which, essentially, works fine):

USE [telangiectasia]
GO

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[outtelang]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[outtelang]

GO

CREATE PROCEDURE dbo.outtelang
AS
BEGIN
SELECT
'combid', 'evdate', 'evdatereal', 'medcode', 'descr'
Union All
Select
Convert(varchar (100), combid),
Convert(varchar (100), evdate),
Convert(varchar (100), evdatereal, 103),
Convert(varchar (100), medcode),
Convert(varchar (100), descr)
FROM ##telang
END

GO

EXEC xp_cmdshell

'bcp "EXEC telangiectasia.dbo.outtelang" queryout "L:\telangiec\telang.txt" -T -c -a32768'

GO

The bcp takes its data from a temp table (##telang). This is very fast and everything outputs correctly except for some of the entries in the last field 'descr' (description). For reasons that baffle me, most of the entries are correct but some of them add spurious extra characters; this seems to be restricted to descriptions that are quite short (longer ones output correctly even though they are under 100 characters in length). Here is what it looks like:

Telang_events
combid evdate evdatereal medcode descr
a670600F@ 20100511 11/05/2010 P736.11 Pulmonary arterio-venous fistulaÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
a670601hD 20040000 7A60.00 Arteriovenous shuntÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
a670602Vf 20000000 G770.00 Hereditary Haemorrhagic Telangiectasia
a673201Kz 19950901 01/09/1995 G770.00 Hereditary Haemorrhagic Telangiectasia
a6734005m 20071012 12/10/2007 7A60.00 Arteriovenous shuntÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
a673407RK 20100805 05/08/2010 7A60.00 Arteriovenous shuntÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
a674400tH 19990302 02/03/1999 G770.00 Hereditary Haemorrhagic Telangiectasia

Sorry for the layout problems.

As you can see, the first four fields are fine and the issue is restricted to shorter entries in the final field. I should add that, if I make a saved table and then pull this into Access, there is no problem - the description field is perfect. So I don't think there are underlying problems in the data - it is the bcp process that is generating the spurious elements.

By the way, if a date string cannot be parsed into date format, evdatereal is blank (eg 20040000).

Can anyone help me to understand this and to fix the issue with the description field?

Thanks.

Chris

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2011-09-07 : 08:39:40
can you please show the code that is filling the table ##telang with data?


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

cjp
Yak Posting Veteran

69 Posts

Posted - 2011-09-08 : 03:22:59
I was about to post the requested code when I had a thought that means I have found the solution (there is nothing wrong with bcp).

The lookup for diagnoses comes in a professionally prepared dictionary that has no spurious characters. For this study, however, a student had hacked the dictionary and had presented me with his hack. For some reason, he had introduced spaces (and even a couple of carriage returns) into the description field for shorter entries whilst he was preparing the lookup. Perhaps he was trying to even up the length of the field for each entry.

The long and the short of it is that I tested this hypothesis by identifying and removing the spurious spaces in his lookup and, lo and behold, the procedure worked perfectly - no nonsense in the output.

I suppose this shows that nothing should be taken for granted. Students have been instructed not to hack dictionaries in Excel, but they do it anyway. So... I will simply have to assume that lookups are broken and go looking for trouble before I use them.

Sorry to have posted unnecessarily on this but I hope the cautionary tale will be of use: don't use other people's data until you have established that it is bullet-proof. Thanks.

Chris
Go to Top of Page
   

- Advertisement -