Author |
Topic |
bigbelly
Starting Member
39 Posts |
Posted - 2007-10-27 : 02:01:44
|
I'm trying to use a linked server to import a large amount of data (around 9 million rows) from MySQL into a SQL 2005 table. I plan to create a stored procedure and use the job agent to run it at off-peak time every day. The SP is like:

    Truncate table sqltblA;
    Insert Into sqltblA (col1, col2, col3)
    Select col1, col2, col3
    From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')

But when I tried to CREATE the above SP from Management Studio, it seems SQL Server was trying to get all the data from table mytblB, and it ended up failing after hanging for several minutes. The error message is something like "running out of memory". BTW, during those several minutes I couldn't query mytblB using MySQL's tool either; it seems to me that table got locked.

However, if I change the returned result to 100,000 rows by changing the script to

    Insert Into sqltblA (col1, col2, col3)
    Select col1, col2, col3
    From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB Limit 100000')

the SP can be created without any problem, and running the SP gets the data into table sqltblA. But that's not what I want; I need all rows, not a certain number of rows. Does anyone know what's going on when I try to CREATE that SP, and is there any solution?

Plus, I tried SSIS as well. When I'm configuring the DataReader source, which uses ADO.NET's ODBC option to connect to MySQL, if its SqlCommand is "select * from mytblB" without the Limit keyword, the configuration page hangs there as well, and table mytblB is not accessible from the MySQL tool. I'm just sick of wasting time on that, and that's why I chose to use an SP instead of SSIS. But same result. :-( |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-27 : 03:58:14
|
I would expect the SSIS route to be better, because you can more easily control the batch size. I doubt you are going to have success doing a SELECT ... FROM ... on 9 million rows via this route without some sort of batch size or loop, or some hints to both ends to just "ship the data" without doing any batching etc.

"use job agent to run it at off-peak time everyday"

Is this the same 9 million rows every day (including some that have changed)? Or a different 9 million rows? Because if it is a dataset that is largely the same, you need to find a way to only ship the ones that have changed, or changed recently, which will then give you a more controlled process.

What is the speed of the link between MySQL and MSSQL like? Is it a LAN or across the Web? We have a LAN connection between a high-end MSSQL and a very high-end Oracle box. That would take "the best part of a long time" to shift 9 million rows in one go.

Kristen |
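For what it's worth, a minimal sketch of the "only ship recent changes" idea, assuming the MySQL table exposes some change-date column. `lastupdate` here is a hypothetical column name, not from the original post - swap in whatever the real schema has:

```sql
-- Hypothetical sketch: pull only rows touched in the last day,
-- instead of all 9 million. 'lastupdate' is an assumed column name.
INSERT INTO sqltblA (col1, col2, col3)
SELECT col1, col2, col3
FROM OPENQUERY(mysql,
    'SELECT col1, col2, col3
     FROM mytblB
     WHERE lastupdate >= CURDATE() - INTERVAL 1 DAY')
```

The filter runs inside the OPENQUERY string, so MySQL does the filtering and only the changed rows cross the link.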
 |
|
bigbelly
Starting Member
39 Posts |
Posted - 2007-10-27 : 11:36:03
|
Thanks Kristen. The MySQL table is holding voice product data, so its data changes frequently and is not fixed to a certain number of rows. The 2 servers are in different buildings, but the connection's bandwidth is enough.

Sorry, I didn't give the exact error message in the first place. The error is:

OLE DB provider "MSDASQL" for linked server "mysql" returned message "[MySQL][ODBC 3.51 Driver][mysqld-4.0.22-standard-log]MySQL client run out of memory". (.Net SqlClient Data Provider)

Does this error message indicate it's the server with MySQL running out of memory, OR the server with SQL Server running out of memory? The server with SQL Server has 2 GB RAM, and the SQL Server version is:

Microsoft SQL Server 2005 - 9.00.2047.00 (Intel X86) Apr 14 2006 01:12:25 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

I know that I could do it by batching: looping a job which grabs a smaller number of rows each time. Actually, what I want to know most is that so far I'm only trying to CREATE the SP; I haven't got a chance to run it to import data yet. Why is it running out of memory already? It seems to me that when parsing the query

    Select col1, col2, col3 From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')

the system actually was going through those 9 million rows. Can anyone explain what's going on internally? Thanks! |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-27 : 11:42:55
|
"Does this error message indicate it's server with mysql running out of memory OR the server with sql server running out of memory?"Reads to me like its the MySQL end."Why it's running out memory already?"because 9 millions rows with no flow control is a lot! Particularly if the table contains large column(s) - such as TEXT, IMAGE, BLOB etc.You need a more sophisticated way to transfer that much data rather than just INSERT INTO ... SELECT ... FROM. SSIS will give you that sophistication (e.g. it will control batch size) with less effort than building a custom Sproc. Unless these are 9 million NEW records every time you ought to consider doing something at the MySQL end so that you know what has changed and work on only transferring the New/Deleted/Changed records each time.Kristen |
 |
|
bigbelly
Starting Member
39 Posts |
Posted - 2007-10-27 : 13:31:07
|
Thanks Kristen for your prompt reply. The MySQL database is a replica of our voice switch database. We only have read-only permission to access it, so I can't do much on that side to find out which rows are New/Changed. BTW, every day most rows' data gets changed (by looking at the columns' names, such as: balance, status, firstusedate, lastusedate, lastrechargedate..., you will know what I mean). I'll look into SSIS more to find a better solution.

I understand that 9 million rows of data is a lot. But my puzzle is that I'm only CREATing the SP at the moment. Does SQL Server need to go through all that data at the time it merely parses the query? It would be easy for me to understand if the system told me "running out of memory" when I was trying to RUN that SP. Same thing with SSIS, as I mentioned in my first post: when I'm configuring SSIS's DataReader source, if I don't put that "limit" keyword in its SqlCommand, it will also hang there, which suggests SSIS was trying to load/go through (I don't know which word to use to describe that action) those 9 million rows at that time. Could you please explain what SQL Server is doing at that moment? Thanks! |
 |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-10-27 : 13:51:53
|
Or dump the mysql data to a text file, then bulk insert it into the sql table? |
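In case it helps, a rough sketch of that route. Paths, delimiters, and the batch size are illustrative, not from the thread; also note that SELECT ... INTO OUTFILE needs the FILE privilege on MySQL, so with read-only access you may have to dump from the client instead (e.g. redirect the output of `mysql --batch -e "SELECT ..."` to a file):

```sql
-- MySQL side (needs FILE privilege; path is illustrative):
--   SELECT col1, col2, col3 FROM mytblB
--   INTO OUTFILE '/tmp/mytblB.txt'
--   FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';

-- SQL Server side, after copying the file somewhere the server can read:
TRUNCATE TABLE sqltblA

BULK INSERT sqltblA
FROM '\\fileserver\share\mytblB.txt'   -- illustrative path
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 100000,          -- commit every 100k rows
    TABLOCK
)
```

BATCHSIZE is what gives you the flow control the linked-server INSERT lacked: each batch is committed as its own transaction instead of buffering all 9 million rows at once.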
 |
|
bigbelly
Starting Member
39 Posts |
Posted - 2007-10-30 : 02:25:48
|
Could anyone give me some clues about how to control the batch size in SSIS, as mentioned by Kristen? Do I only need to set the "rows per batch" in the "OLE DB Destination"? Do I need to do anything on the "DataReader Source"? Thanks! |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-30 : 03:05:55
|
I don't know in SSIS. In DTS (SQL 2000) you could use the tools to point-and-click on Source and Destination, "save as a package", and then edit the package; on the visual diagram you could then right-click on the "pipe" between the Source & Destination Connections, and in the Options was the "INSERT batch size". Dunno where that is in SQL 2005 though.

Kristen |
 |
|
rishikesh1975
Starting Member
1 Post |
Posted - 2007-11-01 : 15:43:49
|
You can continue to insert the records from the linked server, but you need to restrict the number of records, say 1000, by limiting the rows fetched, like:

    Set rowcount 1000
    Insert Into sqltblA(col1, col2, col3)
    Select col1, col2, col3
    From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')
    set @outparam = @@rowcount

The truncate command

    Truncate table sqltblA;

should be executed once, and the set of commands above should be put inside a procedure and called repeatedly with an output parameter, till the output parameter becomes zero. But make sure that during each call the inserted rows are committed. |
 |
|
bigbelly
Starting Member
39 Posts |
Posted - 2007-11-12 : 08:19:56
|
Hi rishikesh1975, your solution won't work for my case. It will keep grabbing the first 1000 rows from MySQL, so it's an endless loop. Your approach is like a "controlled delete" (http://weblogs.sqlteam.com/tarad/archive/2003/10/15/305.aspx), and I'm searching for a way to do a "controlled insert". For my case, is there a way to insert a certain number of rows at a time and commit in SQL Server if the MySQL source table doesn't have an ID column? Thanks! |
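Since MySQL accepts LIMIT offset, count, one hedged way to get a "controlled insert" without an ID column is to page on the MySQL side and loop on the SQL Server side. OPENQUERY only takes a literal string, so the offset has to be spliced in with dynamic SQL. A minimal sketch - the batch size and variable names are mine, and without an ORDER BY in the inner query MySQL does not strictly guarantee a stable row order between batches, so add one on a key column if the table has any:

```sql
DECLARE @offset INT, @batch INT, @sql NVARCHAR(4000), @rows INT
SET @offset = 0
SET @batch  = 100000

TRUNCATE TABLE sqltblA

WHILE 1 = 1
BEGIN
    -- OPENQUERY needs a literal, so build the LIMIT clause dynamically
    SET @sql = N'INSERT INTO sqltblA (col1, col2, col3) '
             + N'SELECT col1, col2, col3 '
             + N'FROM OPENQUERY(mysql, ''SELECT col1, col2, col3 FROM mytblB '
             + N'LIMIT ' + CAST(@offset AS NVARCHAR(12))
             + N', '     + CAST(@batch  AS NVARCHAR(12)) + N''')'
    EXEC (@sql)
    SET @rows = @@ROWCOUNT          -- rows inserted by the last batch

    IF @rows < @batch BREAK         -- last (partial) batch fetched
    SET @offset = @offset + @batch
END
```

Each INSERT commits on its own (outside any explicit transaction), so the memory pressure of one big 9-million-row statement never builds up.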
 |
|
adecicco
Starting Member
1 Post |
Posted - 2008-12-30 : 11:59:52
|
I'm sure you've long since moved on, but it's your driver that's out of memory. Go to the DSN you're using for the linked server and select the "Allow big results" option. |
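For reference, the same setting can be supplied without the DSN GUI via the OPTION bit-mask in the ODBC connection string - the value 8 is the Connector/ODBC 3.51 flag usually documented as "Allow big results", but verify it against the driver docs for your version (server, database, and credentials below are placeholders):

```
DRIVER={MySQL ODBC 3.51 Driver};SERVER=myhost;DATABASE=mydb;UID=myuser;PWD=mypass;OPTION=8
```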
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2008-12-30 : 12:20:08
|
quote: Originally posted by adecicco I'm sure you've long since moved on, but it's your driver that's out of memory. Go to the DSN you're using for the linked server and select the "Allow big results" option.
solution after 1 year |
 |
|
|