
Importing a ridiculous number of records.

Author  Topic 

qman
Constraint Violating Yak Guru

442 Posts

Posted - 2011-08-24 : 09:34:08
I am trying to import 3 CSV files into SQL using SSIS's BULK INSERT task.

Each CSV contains 30 Million records, yes seriously... 30 Million records per CSV.

The package is failing on the BULK INSERT task. Is there a file size limit? Can SSIS and SQL handle this type of request?

I have also tried using a DATA FLOW task. I am able to import about 6 million records before the task blows up....

Any suggestions?

Thanks
Q

robvolk
Most Valuable Yak

15732 Posts

Posted - 2011-08-24 : 09:53:30
I've imported 100+ million rows using bcp and BULK INSERT from a single CSV file, so there's no reason SSIS couldn't handle it.

What exactly is causing the failure? Are you getting an error message? Typical causes are data or log file growth that exceeds the available disk space, or that hits the maximum size configured for the data or transaction log file (assuming it's not set to unlimited).
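
For reference, a minimal BULK INSERT from a comma-delimited file looks something like the sketch below; the file path, table name, and option values are just placeholders, and loading in batches is one way to keep transaction log growth under control:

BULK INSERT dbo.MyStagingTable -- placeholder destination table
FROM 'C:\Import\File1.csv' -- placeholder file path
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2, -- skip a header row, if the file has one
    BATCHSIZE = 100000, -- commit in batches to limit transaction log growth
    TABLOCK -- can allow minimal logging under the simple or bulk-logged recovery model
)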

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2011-08-24 : 10:10:20
I think the error is something else, and it can be seen from inside BIDS.
The main reason seems to be that a column has the wrong attributes, such as a value that is too long for the destination column, a datetime value that is out of range, or similar.

We are importing some 500+ million rows per day from 40+ files (the smallest has 5 rows and the largest 100+ million rows). It takes about 25 minutes.



N 56°04'39.26"
E 12°55'05.63"

Kristen
Test

22859 Posts

Posted - 2011-08-24 : 10:22:01
A value in a column with an inappropriate data type? Perhaps you have an extra comma, embedded in the data for a column, that is not "escaped" with quotes - and thus the remaining columns in that row are off by one?

I normally create a staging table with the appropriate number of columns, all of the VARCHAR(MAX) datatype, import into that, and do some analysis of the columns - e.g. to see if an integer column contains any non-numeric values:

SELECT ...
FROM MyStagingTable
WHERE MyIntColumn LIKE '%[^0-9]%'

Similarly for dates. Also check, for varchar columns, that MAX(LEN(MyColumn)) does not exceed the defined width of the destination column.
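
As a rough sketch of those checks (MyDateColumn and MyVarcharColumn are just placeholder column names):

SELECT *
FROM MyStagingTable
WHERE ISDATE(CONVERT(VARCHAR(50), MyDateColumn)) = 0 -- cast because ISDATE does not accept (MAX) types
AND MyDateColumn IS NOT NULL

SELECT MAX(LEN(MyVarcharColumn)) AS MaxLength -- compare against the destination column's defined width
FROM MyStagingTable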

Once that is all OK you can either repeat the bulk import directly into the destination table (if you have cleaned up the source data), or insert from your staging table (i.e. after that has been cleaned up, or by ignoring the error rows).

To ignore the error rows, add an "Error Message" column and UPDATE it on any error rows, e.g.:

UPDATE U
SET [MyErrorMsg] = COALESCE([MyErrorMsg] + '; ', '') + 'MyIntColumn not numeric' -- append to any existing message, or start a new one
FROM MyStagingTable AS U
WHERE MyIntColumn LIKE '%[^0-9]%'
AND MyIntColumn IS NOT NULL
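
For completeness, the MyErrorMsg column could be added to the staging table beforehand with something like this (the VARCHAR(500) width is just an assumption):

ALTER TABLE MyStagingTable
ADD MyErrorMsg VARCHAR(500) NULL -- starts out NULL; error rows get a message written into it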

then you can "import" just the non-error rows:

INSERT INTO ...
SELECT Col1, Col2, ...
FROM MyStagingTable
WHERE MyErrorMsg IS NULL

qman
Constraint Violating Yak Guru

442 Posts

Posted - 2011-08-24 : 11:06:06
SwePeso's suggestion pointed to my issue: one of the column data types was not long enough. So far, it appears to be working. I am just past the 8 million mark. Fingers crossed....
   
