
 get data using openquery() from mysql to sql 2005


bigbelly
Starting Member

39 Posts

Posted - 2007-10-27 : 02:01:44
I'm trying to use a linked server to import a large amount of data (around 9 million rows) from MySQL into a SQL 2005 table.

I plan to create a stored procedure and use the job agent to run it at off-peak time every day. The SP is like:

.....

Truncate table sqltblA;

Insert Into sqltblA(col1, col2, col3)
Select col1, col2, col3
From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')

.....

But when I tried to CREATE the above SP from Management Studio, it seems SQL Server was trying to get all the data from table mytblB, and it ended up failing after hanging for several minutes. The error message was something like "running out of memory". BTW, during those several minutes I couldn't query mytblB using MySQL's tool either; it seems to me that table got locked.



However, if I try to change the returned result to 100,000 rows by changing the script to

Insert Into sqltblA(col1, col2, col3)
Select col1, col2, col3
From OpenQuery('mysql', 'Select col1, col2, col3
From mytblB Limit 100000')

then the SP can be created without any problem, and running the SP gets the data into table sqltblA.

But that's not what I want; I need all the rows, not just a certain number of them.

Does anyone know what's going on when I try to CREATE that SP, and is there any solution to it?


Plus, I tried to use SSIS as well. When I'm configuring the DataReader source, which uses ADO.NET's ODBC option to connect to MySQL, if its SqlCommand is "select * from mytblB" without the Limit keyword, then the configuration page hangs there as well, and table mytblB is not accessible using the MySQL tool. I got sick of wasting time on that, which is why I chose to use an SP instead of SSIS. But same result. :-(

Kristen
Test

22859 Posts

Posted - 2007-10-27 : 03:58:14
I would expect the SSIS route to be better, because you can more easily control the batch size.

I doubt you are going to have success doing a SELECT ... FROM ... on 9 million rows via this route without some sort of batch size or loop, or some hints to both ends to just "ship the data" without doing any batching etc.

"use job agent to run it at off-peak time everyday"

Is this the same 9 million rows every day (including some that have changed), or a different 9 million rows?

Because if it is a dataset that is largely the same, you need to find a way to only ship the rows that have changed, or changed recently, which will then give you a more controlled process.

What is the speed of the link between MySQL and MSSQL like? Is it a LAN or across the Web?

We have a LAN connection between a high end MSSQL and a very high end Oracle box. That would take "the best part of a long time" to shift 9 million rows in one go.

Kristen
Go to Top of Page

bigbelly
Starting Member

39 Posts

Posted - 2007-10-27 : 11:36:03
Thanks Kristen.

The MySQL table holds voice product data, so its contents change frequently and aren't fixed at a certain number of rows.

The two servers are in different buildings, but the connection's bandwidth is sufficient.

Sorry, I didn't give the exact error message in the first place. The error is:

OLE DB provider "MSDASQL" for linked server "mysql" returned message "[MySQL][ODBC 3.51 Driver][mysqld-4.0.22-standard-log]MySQL client run out of memory". (.Net SqlClient Data Provider)

Does this error message indicate that the server running MySQL is running out of memory, or the server running SQL Server?

The server with SQL Server has 2 GB RAM, and the SQL Server version is:
Microsoft SQL Server 2005 - 9.00.2047.00 (Intel X86) Apr 14 2006 01:12:25 Copyright (c) 1988-2005 Microsoft Corporation Standard Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

I know that I could do it by batching: looping a job which grabs a smaller number of rows each time. Actually, what I want to know most is this: so far I'm only trying to CREATE the SP. I haven't had a chance to run it to import data yet. Why is it running out of memory already? It seems to me that when parsing the query "Select col1, col2, col3 From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')", the system actually went through those 9 million rows. Can anyone explain what's going on internally? Thanks!
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-27 : 11:42:55
"Does this error message indicate it's server with mysql running out of memory OR the server with sql server running out of memory?"

Reads to me like it's the MySQL end.

"Why is it running out of memory already?"

Because 9 million rows with no flow control is a lot! Particularly if the table contains large columns, such as TEXT, IMAGE, BLOB etc.

You need a more sophisticated way to transfer that much data than just INSERT INTO ... SELECT ... FROM. SSIS will give you that sophistication (e.g. it will control batch size) with less effort than building a custom sproc. Unless these are 9 million NEW records every time, you ought to consider doing something at the MySQL end so that you know what has changed, and work on transferring only the New/Deleted/Changed records each time.
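[Editor's sketch of the "ship only the changes" idea above, assuming the source table carries some modification timestamp; the last_modified column is hypothetical, while the linked server and table names are the ones from this thread:]

```sql
-- Pull only rows touched in the last day; the WHERE clause runs on the
-- MySQL side inside OPENQUERY, so only the delta crosses the wire.
-- last_modified is a hypothetical column -- substitute whatever the table has.
INSERT INTO sqltblA (col1, col2, col3)
SELECT col1, col2, col3
FROM OPENQUERY(mysql, 'SELECT col1, col2, col3
                       FROM mytblB
                       WHERE last_modified >= CURDATE() - INTERVAL 1 DAY');
```

[Deletes and already-present rows would still need separate handling, e.g. an UPDATE/INSERT pair against the target, but the volume per run drops dramatically.]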

Kristen
Go to Top of Page

bigbelly
Starting Member

39 Posts

Posted - 2007-10-27 : 13:31:07
Thanks Kristen for your prompt reply.

The MySQL database is a replication of our voice switch database. We only have read-only permission to access it, so I can't do much on the MySQL side to find out which rows are New/Changed. BTW, every day most rows' data gets changed (looking at the column names, such as balance, status, firstusedate, lastusedate, lastrechargedate, ..., you'll know what I mean).

I'll look into SSIS more to find out a better solution.

I understand that 9 million rows of data is a lot. But my puzzle is that I'm only creating the SP at the moment. Does SQL Server need to go through all that data when it merely parses the query? It would be easy for me to understand if the system told me "running out of memory" when I tried to run that SP.

Same thing with SSIS, as I mentioned in my first post: when I'm trying to configure SSIS's DataReader source, if I don't put that "Limit" keyword in its SqlCommand, it will also hang there, which suggests SSIS was trying to load/go through (I don't know which word to use to describe that action) those 9 million rows at that time. Could you please explain what SQL Server is doing at that moment? Thanks!
Go to Top of Page

rmiao
Master Smack Fu Yak Hacker

7266 Posts

Posted - 2007-10-27 : 13:51:53
Or dump the MySQL data to a text file, then bulk insert it into the SQL table?
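[Editor's sketch of this file-based route, assuming a tab-delimited export file; the path, terminators and batch size below are illustrative:]

```sql
-- Step 2 of the dump-and-load approach. The file is produced on the MySQL
-- side first (e.g. with mysql --batch -e "SELECT ..." > mytblB.txt, or
-- SELECT ... INTO OUTFILE). BATCHSIZE commits every 100,000 rows, so a
-- failure doesn't roll back the whole 9-million-row load.
BULK INSERT sqltblA
FROM 'C:\transfer\mytblB.txt'
WITH (
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\n',
    TABLOCK,
    BATCHSIZE = 100000
);
```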
Go to Top of Page

bigbelly
Starting Member

39 Posts

Posted - 2007-10-30 : 02:25:48
Could anyone give me some clues about how to control batch size in SSIS, as mentioned by Kristen? Do I only need to set "Rows per batch" in the "OLE DB Destination"? Do I need to do anything on the "DataReader Source"? Thanks!
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-10-30 : 03:05:55
I don't know in SSIS.

In DTS (SQL 2000) you could use the tools to point-and-click on Source and Destination, "save as a package", and then edit the package; on the visual diagram you could then right-click on the "pipe" between Source & Destination Connections and in the Options was the "INSERT batch size".

Dunno where that is in SQL 2005 though

Kristen
Go to Top of Page

rishikesh1975
Starting Member

1 Post

Posted - 2007-11-01 : 15:43:49
You can continue to insert the records from the linked server, but you need to restrict the number of records (say, 1000) by limiting the rows fetched, like:

Set rowcount 1000
Insert Into sqltblA(col1, col2, col3)
Select col1, col2, col3
From OpenQuery('mysql', 'Select col1, col2, col3 From mytblB')
set @outparam = @@rowcount

The truncate command

Truncate table sqltblA;

should be executed once, and the set of commands above should be put inside a procedure and called repeatedly with an output parameter, until the output parameter becomes zero.

But make sure that during each call the inserted rows are committed.

Go to Top of Page

bigbelly
Starting Member

39 Posts

Posted - 2007-11-12 : 08:19:56
Hi rishikesh1975, your solution won't work for my case. It will keep grabbing the same first 1000 rows from MySQL, so it's an endless loop.

Your approach is like a "controlled delete" (
http://weblogs.sqlteam.com/tarad/archive/2003/10/15/305.aspx), and I'm searching for a way to do a "controlled insert".

For my case, is there a way to insert a certain number of rows at a time and commit in SQL Server when the MySQL source table doesn't have an ID column? Thanks!
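[Editor's sketch of one possible "controlled insert" without an ID column, paging with MySQL's LIMIT offset, row_count syntax. Batch size and variable names are illustrative; note that if the source table changes between batches, rows can be skipped or duplicated, so this is only safe against a quiesced source:]

```sql
DECLARE @Offset INT, @BatchSize INT, @Rows INT, @Sql NVARCHAR(2000);
SELECT @Offset = 0, @BatchSize = 100000, @Rows = @BatchSize;

TRUNCATE TABLE sqltblA;

WHILE @Rows = @BatchSize          -- a short final batch means we're done
BEGIN
    -- Build the OPENQUERY dynamically so the MySQL-side LIMIT can move.
    SET @Sql = N'INSERT INTO sqltblA (col1, col2, col3)
                 SELECT col1, col2, col3
                 FROM OPENQUERY(mysql, ''SELECT col1, col2, col3 FROM mytblB
                 LIMIT ' + CAST(@Offset AS NVARCHAR(12)) + N', '
                         + CAST(@BatchSize AS NVARCHAR(12)) + N''')';
    EXEC (@Sql);                  -- each EXEC commits its own batch
    SET @Rows = @@ROWCOUNT;       -- rows inserted by the dynamic batch
    SET @Offset = @Offset + @BatchSize;
END
```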

Go to Top of Page

adecicco
Starting Member

1 Post

Posted - 2008-12-30 : 11:59:52
I'm sure you've long since moved on, but it's your driver that's out of memory. Go to the DSN you're using for the linked server and select the "Allow big results" option.
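[Editor's note: the same driver setting can reportedly be applied without the DSN GUI by putting it in the provider string of a DSN-less linked server. OPTION=8 is the Connector/ODBC 3.51 flag commonly documented as "Allow big results", but that value and all connection details below are assumptions to verify against your driver's documentation:]

```sql
-- Hypothetical DSN-less linked server; SERVER/DATABASE/UID/PWD are placeholders.
-- OPTION=8 is assumed to be the "Allow big results" flag for Connector/ODBC 3.51.
EXEC sp_addlinkedserver
    @server     = 'mysql',
    @srvproduct = 'MySQL',
    @provider   = 'MSDASQL',
    @provstr    = 'DRIVER={MySQL ODBC 3.51 Driver};SERVER=myhost;DATABASE=mydb;UID=myuser;PWD=mypass;OPTION=8';
```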
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-12-30 : 12:20:08
quote:
Originally posted by adecicco

I'm sure you've long since moved on, but it's your driver that's out of memory. Go to the DSN you're using for the linked server and select the "Allow big results" option.


solution after 1 year
Go to Top of Page
   
