jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-07 : 14:15:18
|
I have a rather large text file that I am importing with BCP. It is known to have over 600 million rows. However, when I import it I get this:

SQLState = HY000, NativeError = 0
Error = [Microsoft][SQL Native Client]Unexpected EOF encountered in BCP data-file
214698811 rows copied.

So only about a third of the rows make it in. I am fairly certain that the file does not end after row 214698811. My certainty is based on the file size: other files with similar size and exactly the same schema managed to import fully, and they have over 600m rows.

My question is, anyone have any ideas how I might be able to diagnose the problem with this file? Maybe a super-fantastic text editor I could view a 39gb text file with, and jump straight to row 214698811 to see if there is any weirdness there?

elsasoft.org |
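One way to test the "file does not end after row 214698811" assumption directly, rather than inferring it from the file size, is to count the rows by streaming the file. The following is only a minimal Python sketch: it assumes rows are terminated by a newline byte, and the name `count_rows` is hypothetical, not anything from BCP itself.

```python
def count_rows(path, chunk_size=1 << 20):
    """Count newline-terminated rows by streaming the file in binary chunks.

    Assumption: each row ends with b"\n" (a b"\r\n" terminator also ends in b"\n").
    The file is never loaded whole, so a multi-gigabyte file is fine.
    """
    rows = 0
    last_byte = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            rows += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A final row with no trailing newline still counts as a row.
    if last_byte and last_byte != b"\n":
        rows += 1
    return rows
```

If the count comes back well above 214698811, the file itself is probably not truncated at that row, and the failure is more likely something in or near that row, or something transient.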
|
jhocutt
Constraint Violating Yak Guru
385 Posts |
Posted - 2008-04-07 : 14:28:05
|
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=97416&SearchTerms=Billion

"God does not play dice" -- Albert Einstein
"Not only does God play dice, but he sometimes throws them where they cannot be seen." -- Stephen Hawking |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-07 : 15:44:14
|
Thanks for the link, but that's not applicable: this file has fewer than MAX_INT rows. Plus, I am the OP on that post, so I am well aware of it. |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-07 : 15:51:45
|
I'll probably just write my own app in compiled code to stream in this file till I get to row 214698811 and see if there's anything funky in the row following it. Anyone have a better idea? |
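For what it's worth, the streaming idea does not strictly need compiled code. Here is a rough Python sketch of it, offered as one possible approach rather than what was actually written: it assumes newline-terminated rows, and `rows_around` and its parameters are hypothetical names.

```python
from itertools import islice


def rows_around(path, target_row, context=10):
    """Yield (row_number, raw_bytes) for the rows near target_row (1-based).

    Reads in binary so that stray control characters or odd line
    terminators are preserved exactly as they appear in the file.
    """
    first = max(1, target_row - context)
    last = target_row + context
    with open(path, "rb") as f:
        # islice lazily reads and discards every row before `first`,
        # so memory use stays flat regardless of file size.
        window = islice(f, first - 1, last)
        for number, raw in enumerate(window, start=first):
            yield number, raw
```

Printing `repr(raw)` for each pair from `rows_around("data.txt", 214698811)` (the file name is a placeholder) would show the row BCP stopped on plus its neighbours, with non-printable bytes made visible. If the file has fewer rows than requested, the generator simply ends early.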
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-08 : 12:12:16
|
Well, just for fun I tried the import again without changing anything and it worked fine. I guess it was a hiccup. |
 |
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2008-04-09 : 09:41:21
|
Text utilities are your friends!

To get the 10 lines before, the problem line and 10 lines after, something like this:

head -n 214698821 BCP-data-file.txt | tail -n 21

...although, if it turns out that there aren't at least 214698821 lines in the file, you won't get the lines you wanted!

You could even use nl to number the output lines (considering proliferation of magic numbers & consequent fragility, this is probably getting silly):

head -n 214698821 BCP-data-file.txt | tail -n 21 | nl -b a -v 214698801

quote: Originally posted by jezemine
My question is, anyone have any ideas how I might be able to diagnose the problem with this file? Maybe a super-fantastic text editor I could view a 39gb text file with, and jump straight to row 214698811 to see if there is any weirdness there?
|
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2008-04-09 : 09:54:36
|
What are head and tail? Are they Unix commands? I don't seem to have them on Windows. |
 |
|
igorblackbelt
Constraint Violating Yak Guru
407 Posts |
Posted - 2008-04-09 : 10:37:13
|
As far as I know, yes, head and tail are Unix commands, and I miss them (and others) very much, since we had 3 Linux ETL servers at my old job. =/ |
 |
|
jhocutt
Constraint Violating Yak Guru
385 Posts |
Posted - 2008-04-09 : 10:47:11
|
Head, tail and other Unix commands can be run on Windows using Cygwin (http://www.cygwin.com) or the GNU utilities (http://www.gnu.org/software/coreutils/). |
 |
|
|