Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 Incremental load using SSIS

Author  Topic 

rockstar283
Yak Posting Veteran

96 Posts

Posted - 2011-05-30 : 03:33:44
I have a table in SQL say 'X'(i primary key int,j int, dt datetime) and it contains some data. The initial load of data is fine but I am confused about how to handle the incremental data. i.e. suppose initially X contains some data

1,2, 12-22-2010
3,4, 06-12-2011

and the next time the following data is coming into X

1,2,12-22-2010 3,4, 06-12-2011
5,6, 07-01-2011
3,7, 10-10-2011

so the X shall become,

1,2, 12-22-2010
3,7, 10-10-2011
5,6, 07-01-2011

can somebody please tell me how to handle the above situation using SSIS using different transformations and using pure sql?

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-30 : 06:48:29
If it's purely incremental (there will be no gaps in the PK) you can do like this:

1. Create an execute sql task and run select max(ID) from table on the TARGET database
2. In this execute sql task -> general, set the result set as Single Row. Then in Result set in the left side meny, map the result set to the user variable
3. Create a data flow task and add an ole db source with a sql query to select data from your source table and add a filter to your query like this "WHERE ID > ?"
4. Click Parameters and map the user variable to Parameter0


- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

rockstar283
Yak Posting Veteran

96 Posts

Posted - 2011-05-30 : 19:44:48
That will work fine only when there are completely new records in the incoming load..but what if the load contains the updated values of already existing records..eg. if u see in the above example, 2nd row had
3,4, 06-12-2011 but after the new load which contains new value for 3 i.e.3,7, 10-10-2011..how will you achieve this?
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-31 : 01:24:07
Well, you said incremental...a table with changing rows is not incremental. In that case you should add a datetime column to the table that says when the record was last updated/inserted Then you need to add a data source query that gets data that has a greater timestamp than X. Then update or insert based on a lookup to check if the record exists or not.

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-31 : 01:24:58
Well, you said incremental...a table with changing rows is not incremental. In that case you should add a datetime column to the table that says when the record was last updated/inserted Then you need to add a data source query that gets data that has a greater timestamp than X. Then update or insert based on a lookup to check if the record exists or not.

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

rockstar283
Yak Posting Veteran

96 Posts

Posted - 2011-05-31 : 01:47:45
ok..so I take an create_dt(date when the record was created for the first time) and update_dt(Date when the record is updated) columns for the load..so when the next load comes..I first retrieve the max(update_dt) from the original source and consider only records from the incoming load whose create_dt is greater than the max(update_dt) we have..am I right?
but what if the incoming load does not have a create_dt or any date associated with them?
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-31 : 02:19:23
Yes, you're right and I forgot to write this in my previous post but this is known as a "delta load".

The concept is really simple actually; you basically try to the best of your ability to transfer as little data as possible without compromising data quality. If it's not possible to identify what data has been inserted/updated/deleted in the source system you have to do a truncload (truncate table / full load) which is perfectly normal especially when loading data from legacy systems. If we're talking small amounts of data (less than a few 100k rows per day) a truncload isn't all that bad. If we're talking several million rows you might want to put a little more effort into identifying the delta.

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page

rockstar283
Yak Posting Veteran

96 Posts

Posted - 2011-05-31 : 02:57:14
hey Thanks a lot mate..I didnt even know its name.. :) You have been of a great help..Thanks again :))
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-05-31 : 03:16:22
Excellent :)

- Lumbago
My blog-> http://thefirstsql.com/2011/02/07/regular-expressions-advanced-string-matching-and-new-split-function-sql-server-2008-r2/
Go to Top of Page
   

- Advertisement -