Author | Topic |
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2011-05-30 : 03:33:44
|
I have a table in SQL, say X (i int primary key, j int, dt datetime), and it contains some data. The initial load of data is fine but I am confused about how to handle the incremental data. For example, suppose X initially contains:

1, 2, 12-22-2010
3, 4, 06-12-2011

and the next time the following data is coming into X:

1, 2, 12-22-2010
3, 4, 06-12-2011
5, 6, 07-01-2011
3, 7, 10-10-2011

so X shall become:

1, 2, 12-22-2010
3, 7, 10-10-2011
5, 6, 07-01-2011

Can somebody please tell me how to handle the above situation in SSIS using different transformations, and also using pure SQL? |
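For the pure-SQL side of the question, here is a minimal sketch using a T-SQL MERGE (SQL Server 2008 and later), assuming the incoming load has been landed in a staging table first; the staging table name X_incoming is illustrative only:

-- Upsert the incoming batch into X, keyed on the primary key i.
-- dbo.X_incoming is an assumed staging table holding the new load.
MERGE dbo.X AS tgt
USING dbo.X_incoming AS src
    ON tgt.i = src.i
WHEN MATCHED AND (tgt.j <> src.j OR tgt.dt <> src.dt) THEN
    UPDATE SET tgt.j = src.j, tgt.dt = src.dt
WHEN NOT MATCHED BY TARGET THEN
    INSERT (i, j, dt) VALUES (src.i, src.j, src.dt);

Applied to the example above, the MATCHED branch rewrites the row for key 3 and the NOT MATCHED branch adds key 5, which gives exactly the desired end state of X.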
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-05-30 : 06:48:29
|
If it's purely incremental (there will be no gaps in the PK) you can do it like this:

1. Create an Execute SQL Task and run select max(ID) from the table on the TARGET database.
2. In this Execute SQL Task -> General, set the result set to Single Row. Then under Result Set in the left-side menu, map the result set to a user variable.
3. Create a Data Flow Task and add an OLE DB source with a SQL query that selects data from your source table, and add a filter to the query like this: "WHERE ID > ?".
4. Click Parameters and map the user variable to Parameter0.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/ |
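A minimal sketch of the two queries behind those steps, assuming the table from the original post (X with primary key i); the variable name User::MaxID and the column names are illustrative only:

-- Step 1: Execute SQL Task against the TARGET database.
-- Result set = Single Row, mapped to a package variable such as User::MaxID.
SELECT MAX(i) AS MaxID FROM dbo.X;

-- Step 3: OLE DB source query against the SOURCE table.
-- The ? placeholder is mapped to User::MaxID as Parameter0 in step 4.
SELECT i, j, dt
FROM dbo.X
WHERE i > ?;

One thing to watch for: if the target table is empty, MAX(i) returns NULL, so it is common to wrap it as ISNULL(MAX(i), 0) (or whatever value is lower than any real key).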
|
|
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2011-05-30 : 19:44:48
|
That will work fine only when there are completely new records in the incoming load. But what if the load contains updated values of already existing records? E.g. in the above example, the 2nd row had 3, 4, 06-12-2011, but the new load contains a new value for key 3, i.e. 3, 7, 10-10-2011. How will you achieve this? |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-05-31 : 01:24:07
|
Well, you said incremental... a table with changing rows is not incremental. In that case you should add a datetime column to the table that says when the record was last updated/inserted. Then you need a data source query that gets the data with a timestamp greater than X, and then update or insert based on a lookup that checks whether the record exists or not.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/ |
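A rough sketch of that pattern in plain SQL, assuming a last_updated column on the source and a staging table holding the extracted delta; the names X_source and X_delta are illustrative only (in SSIS the same logic is usually a Lookup transformation feeding an update path and an insert path):

-- Delta extract: only rows changed since the last load.
-- The ? is the stored high-water mark, e.g. MAX(last_updated) already in the target.
SELECT i, j, dt, last_updated
FROM dbo.X_source
WHERE last_updated > ?;

-- Lookup match: the key already exists, so update it.
UPDATE tgt
SET tgt.j = src.j, tgt.dt = src.dt
FROM dbo.X AS tgt
JOIN dbo.X_delta AS src ON src.i = tgt.i;

-- Lookup no-match: the key is new, so insert it.
INSERT INTO dbo.X (i, j, dt)
SELECT src.i, src.j, src.dt
FROM dbo.X_delta AS src
WHERE NOT EXISTS (SELECT 1 FROM dbo.X AS tgt WHERE tgt.i = src.i);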
|
|
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2011-05-31 : 01:47:45
|
OK, so I add a create_dt (the date when the record was created for the first time) and an update_dt (the date when the record was last updated) column for the load. So when the next load comes, I first retrieve the max(update_dt) from the original source and consider only the records from the incoming load whose create_dt is greater than that max(update_dt). Am I right? But what if the incoming load does not have a create_dt or any date associated with it? |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-05-31 : 02:19:23
|
Yes, you're right, and I forgot to write this in my previous post, but this is known as a "delta load". The concept is really simple actually; you basically try, to the best of your ability, to transfer as little data as possible without compromising data quality. If it's not possible to identify what data has been inserted/updated/deleted in the source system, you have to do a truncload (truncate table / full load), which is perfectly normal, especially when loading data from legacy systems. If we're talking small amounts of data (less than a few hundred thousand rows per day), a truncload isn't all that bad. If we're talking several million rows, you might want to put a little more effort into identifying the delta.

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/ |
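For completeness, a minimal sketch of the truncload fallback in T-SQL; the source table name X_source is illustrative, and in SSIS this would typically be an Execute SQL Task for the truncate followed by a full-table data flow:

-- Full reload: wipe the target and copy everything from the source.
-- Simple and robust, but moves every row on every run.
TRUNCATE TABLE dbo.X;

INSERT INTO dbo.X (i, j, dt)
SELECT i, j, dt
FROM dbo.X_source;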
|
|
rockstar283
Yak Posting Veteran
96 Posts |
Posted - 2011-05-31 : 02:57:14
|
Hey, thanks a lot mate. I didn't even know its name. :) You have been of great help. Thanks again :)) |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-05-31 : 03:16:22
|
Excellent :)

- Lumbago
My blog -> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/ |
|
|
|