sql_er
Constraint Violating Yak Guru
267 Posts |
Posted - 2013-07-24 : 13:32:06
Hi guys,

I've been given an update query recently and told that it takes hours/days to run. This update query joins 2 tables on a few columns and updates 2 columns. It also has a WHERE clause.

Table 1 has ~350 million records and is 280 GB in size. It is partitioned by date ranges.
Table 2 has only 11,000 records (tiny).

The query updates Table 1. Here is the puzzle: when I update TOP (200) rows [or fewer] in Table 1, it takes a split second, but when I update TOP (300) rows or more, it runs for 5 minutes. I tried it many times and I get the same results.

Any idea as to why the timing jumps so much when I increase the required output from 200 to 300?

Please advise. Thanks in advance!
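For reference, the query in question has roughly this shape (the table and column names below are invented for illustration; the real query joins on a few columns and updates two):

```sql
-- Hypothetical sketch of the problem query: BigTable stands in for
-- the 350M-row partitioned table, SmallTable for the 11,000-row table.
UPDATE TOP (300) t1
SET t1.Col1 = t2.Col1,
    t1.Col2 = t2.Col2
FROM dbo.BigTable AS t1
JOIN dbo.SmallTable AS t2
    ON LTRIM(RTRIM(t1.KeyA)) = LTRIM(RTRIM(t2.KeyA))   -- functions on join columns
WHERE t1.SomeFlag = 'N';
```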
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-07-24 : 13:45:54
Look at the query plan and statistics. Turn on the actual execution plan in SSMS (Ctrl+M) and run SET STATISTICS IO ON. Then run the two queries and examine the query plans.
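A minimal way to capture both diagnostics in one batch is to run each version of the update inside a transaction and roll it back, so the data is unchanged (table names here are placeholders):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

BEGIN TRANSACTION;
    -- Run the TOP (200) version first, then repeat the batch with TOP (300)
    UPDATE TOP (200) t1
    SET t1.Col1 = t2.Col1
    FROM dbo.BigTable AS t1
    JOIN dbo.SmallTable AS t2
        ON t1.KeyA = t2.KeyA;
ROLLBACK TRANSACTION;  -- undo the change; we only want the plan and I/O counts

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages tab will then show logical reads per table for each run, which makes the jump between the two row counts easy to compare.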
sql_er
Constraint Violating Yak Guru
267 Posts |
Posted - 2013-07-24 : 13:51:58
98% of the time/cost is spent scanning this huge table, which is a heap. It is scanning because the conditions in the WHERE clause can't use indexes (due to the ISNULL, LTRIM, and RTRIM functions being applied to the columns in the join).

Should I add a clustered index? It might be a big task considering the size of the table (280 GB).

Thank you!
James K
Master Smack Fu Yak Hacker
3873 Posts |
Posted - 2013-07-24 : 14:34:22
It sounds like it is scanning without using the (non-clustered) index that you already have because of non-sargability in the WHERE clause. If that is the case, adding a clustered index is not going to help. If it is the three things that you indicated - ISNULL, LTRIM, RTRIM - it may be possible to rewrite the query so you don't use those functions.

Also, as you correctly guessed, you have to be careful about adding a clustered index on a large table, because when you do that it has to reorganize all the data in the table, which can be time-consuming. It will take up disk space as well (it is a logged operation).

Yep, I used a big word, "sargability", only because I needed a lame reason to post a link to this blog: http://myshallowsqlblog.wordpress.com/saragable-the-word-only-a-geek-could-love/
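As a sketch of that kind of rewrite (column names invented for illustration): since the small table has only 11,000 rows, you can clean its join keys once into a temp table, so the big table's column stays bare in the predicate and an index on it can be used.

```sql
-- Non-sargable original: functions wrap the big table's column,
-- so any index on BigTable.KeyA cannot be seeked.
--   ON ISNULL(LTRIM(RTRIM(t1.KeyA)), '') = ISNULL(LTRIM(RTRIM(t2.KeyA)), '')

-- Sargable alternative: pre-trim the small table's keys instead.
SELECT LTRIM(RTRIM(KeyA)) AS KeyA, Col1
INTO #CleanSmall
FROM dbo.SmallTable;

UPDATE t1
SET t1.Col1 = s.Col1
FROM dbo.BigTable AS t1
JOIN #CleanSmall AS s
    ON t1.KeyA = s.KeyA       -- no function on t1.KeyA: index-friendly
WHERE t1.KeyA IS NOT NULL;    -- replaces the ISNULL() wrapper
```

One caveat: the original ISNULL(..., '') comparison would also have matched NULL keys to empty-string keys; if those matches matter, handle them in a separate step.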
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2013-07-24 : 15:01:09
There seems to be an internal threshold at about 220 rows. Do the update and increase the 200 to 201, 202, etc. until you find the threshold.

Microsoft SQL Server MVP, MCT, MCSE, MCSA, MCP, MCITP, MCTS, MCDBA
sql_er
Constraint Violating Yak Guru
267 Posts |
Posted - 2013-07-24 : 16:54:03
James K. - thanks for the info. I'll try to think of a way to rewrite the query not to use ISNULL, LTRIM, RTRIM.

SwePeso - what do you mean by an internal threshold?

Btw, I started taking advantage of the partitions by adding date ranges to the update query, in line with the partition function specifications, and that made a world of difference. I'm now able to update 5 million records in under 5 minutes.

Thanks!
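The partition trick described above looks roughly like this (the date column and boundary values are assumptions; the point is that the WHERE clause lines up with the partition function's ranges, so SQL Server can eliminate all the other partitions instead of scanning the whole heap):

```sql
-- Assuming BigTable is partitioned on LoadDate by month: restrict the
-- update to one partition's date range at a time.
UPDATE t1
SET t1.Col1 = t2.Col1
FROM dbo.BigTable AS t1
JOIN dbo.SmallTable AS t2
    ON t1.KeyA = t2.KeyA
WHERE t1.LoadDate >= '2013-06-01'
  AND t1.LoadDate <  '2013-07-01';  -- aligns with the partition boundaries
```

Repeating this per date range keeps each update confined to a single partition, which matches the reported result of 5 million rows in under 5 minutes.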