Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
sura
Starting Member
10 Posts |
Posted - 2011-07-03 : 21:57:24
|
| Hi there,I am from ETL background, and not so good about SQL Server parallelism capability. But I am in the position to compare merits and demerits of SQL vs Datastage (ETL tool from IBM) approach. In ETL I can create 100 jobs which will run almost at the same time and finish. So all the jobs are going to run in parallel. But if I want to do it in SQL server I am not sure how this can be possible!!What I need to know is, how efficient SQL Server is to handle the below given scenario?OS WindowsDB SQL Server 2008 R2Example 1Pull the data from SQL Server DB and load into SQL Server DB (source to target)I have 100 tables from the source SQL Server DB. I need to pull the data from source, find the changes (delta) by using check sum based on Key columns and load the delta data (newly changed data) into the target SQL Server table, using SQL Server procedures. Example 2Pull the data from Postgre SQL DBI have 110 tables from Postgres SQL DB. I need to pull the data, find the changes (delta) by using check sum based on Key and load the delta data (newly changed data) into the target SQL Server table, using SQL Server procedures. Example 3Pull the data from Oracle DBI have 75 tables from Oracle DB. I need to pull the data, find the changes (delta) by using check sum based on Key and load the delta data (newly changed data) into the target SQL Server table, using SQL Server procedures. Questions: 1. Did the SQL Server will pull (Consider all the example) the tables in Sequence (take the first table, load and then another...) or is it capable to run in parallel?1a. If it is capable, how to find, how many tables it is pulling at a time?2. How efficient SQL Server is to pull the data other than SQL Server DB, change the data types (Consider date and other data types) and load into SQL Server?It would be great if any one comment in this.Thanks in advance,SQL Server Starter |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-07-04 : 04:55:58
|
| Your trying to compare apples with oranges here. SQL Server is a *database engine* and not an ETL tool. SQL Server does have an ETL tool called SQL Server Integration Services but I have no idea how that compares with IBMs tools. When it comes to your questions of running things in parallel sql server is very capable, but getting things to run in parallel without using SSIS is cumbersome. When a procedure is executed it connects to the database engine and everything that procedure does it does in sequence using the same connection (with some exceptions). Getting several procedures to run in parallel means you have to open several different connections at the same time and the easiest way of doing this is to use SSIS or to chedule the procedures to run at the same time using SQL Server Agent. Regarding the efficiency of ETL transformations you'll basically have to try. But SQL Server does hold the current world record for moving data through an ETL process -> http://blogs.msdn.com/b/sqlperf/archive/2008/02/27/etl-world-record.aspx- LumbagoMy blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/ |
 |
|
|
sura
Starting Member
10 Posts |
Posted - 2011-07-05 : 21:26:16
|
| Thanks for the reply.Sorry, I know that I am trying to compare apple with orange. But sometime need to compare the efficiency the way how these two will handle the same situation. This will help to choose the best approach to rech the goal.Again Thanks for the comments.Cheers, |
 |
|
|
nigelrivett
Master Smack Fu Yak Hacker
3385 Posts |
Posted - 2011-07-05 : 22:16:56
|
| Don't know about 2 but for example 1 and 3 it will probably depend more on the experience of the implementer than the product.Also efficiency probably isn't the most important factor - cost of implementation and mainenance would outweigh it - these would both be higher to get the same performance with less experienced staff, sometimes orders of magnitude.That's without considering network and disk configurations.==========================================Cursors are useful if you don't know sql.SSIS can be used in a similar way.Beer is not cold and it isn't fizzy. |
 |
|
|
|
|
|
|
|