
 bcp fails for 39gb file


jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2008-04-07 : 14:15:18
I have a rather large text file that I am importing with BCP. It is known to have over 600 million rows. However, when I import it I get this:

SQLState = HY000, NativeError = 0
Error = [Microsoft][SQL Native Client]Unexpected EOF encountered in BCP data-file
214698811 rows copied.

so only about a third of the rows make it in. I am fairly certain that the file does not end after row 214698811. My certainty is based on the file size - other files with similar size and exactly the same schema managed to import fully and they have over 600m rows.

My question is, anyone have any ideas how I might be able to diagnose the problem with this file? Maybe a super-fantastic text editor I could view a 39gb text file with, and jump straight to row 214698811 to see if there is any weirdness there?




elsasoft.org

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-07 : 14:28:05
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=97416&SearchTerms=Billion

"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen."
-- Stephen Hawking

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2008-04-07 : 15:44:14
thanks for the link, but that's not applicable. this file has less than MAX_INT rows. Plus, I am the OP on that post so I am well aware of it.



jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2008-04-07 : 15:51:45
I'll probably just write my own app in compiled code to stream in this file till I get to row 214698811 and see if there's anything funky in the row following it.

Anyone have a better idea?



jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2008-04-08 : 12:12:16
well, just for fun I tried the import again without changing anything and it worked fine. I guess it was a hiccup.



Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-04-09 : 09:41:21
Text utilities are your friends!

To get the 10 lines before, the problem line and 10 lines after, something like this:

head -n 214698821 BCP-data-file.txt | tail -n 21

...although, if it turns out that there aren't at least 214698821 lines in the file, you won't get the lines you wanted!
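One way to guard against that case, as a rough sketch: count the lines first and only run the pipeline when the file is long enough. The helper name has_lines is made up for illustration. Note that wc -l counts newline characters, so a final line with no trailing newline is not counted, and on a 39gb file the count means one full read of the file.

```shell
# has_lines FILE N: succeed if FILE has at least N newline-terminated lines.
# (hypothetical helper name, for illustration only)
has_lines() {
    # $(( ... )) strips any padding some wc implementations print
    count=$(( $(wc -l < "$1") ))
    [ "$count" -ge "$2" ]
}
```

Usage would be something like: has_lines BCP-data-file.txt 214698821 && head -n 214698821 BCP-data-file.txt | tail -n 21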

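You could even use nl to number the output lines (considering proliferation of magic numbers & consequent fragility, this is probably getting silly). The first line that tail emits is line 214698801, so that is the starting number to give nl, and -b a makes it number blank lines too:

head -n 214698821 BCP-data-file.txt | tail -n 21 | nl -b a -v 214698801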
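The same window can also be produced in a single pass with awk, which prefixes each line with its actual position in the file, so there is no start offset to work out by hand. This is only a sketch: show_lines is a made-up helper name, and the row and context values are passed in as arguments. If the file is shorter than expected, it simply prints whatever lines in the range exist.

```shell
# show_lines FILE ROW CONTEXT: print line ROW plus CONTEXT lines on either
# side, each prefixed with its real line number, then stop reading.
# (hypothetical helper name, for illustration only)
show_lines() {
    awk -v row="$2" -v ctx="$3" '
        NR >= row - ctx { printf "%d\t%s\n", NR, $0 }  # inside the window
        NR >= row + ctx { exit }                       # past it: stop early
    ' "$1"
}
```

For this file that would be: show_lines BCP-data-file.txt 214698811 10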

quote:
Originally posted by jezemine
My question is, anyone have any ideas how I might be able to diagnose the problem with this file? Maybe a super-fantastic text editor I could view a 39gb text file with, and jump straight to row 214698811 to see if there is any weirdness there?


jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2008-04-09 : 09:54:36
what are head and tail? are they unix commands? i don't seem to have them on windows.



igorblackbelt
Constraint Violating Yak Guru

407 Posts

Posted - 2008-04-09 : 10:37:13
As far as I know, yes, head and tail are Unix commands, and I miss them (and others) very much, since we had 3 Linux ETL servers at my old job. =/

jhocutt
Constraint Violating Yak Guru

385 Posts

Posted - 2008-04-09 : 10:47:11
Head, tail and other Unix commands can be run on Windows using Cygwin http://www.cygwin.com
or from the GNU utilities http://www.gnu.org/software/coreutils/
