Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-27 : 04:09:38
|
| Folks,After much research and advice, I'm trying to rewrite a lot of my queries because many of them have functions in the where clause and I realize this causes performance issues. Before I set up the testing that I've done, I want to point out that the datediff function I'm running looks a bit redundant. The program that runs on the database I use has all the dates in int datatype (yyyymmdd) and I can't change it. So the work around is that all the dates need to be converted to an acceptable format for datediff (I realize I've converted to char, but it works better in the environment we use for a few reasons I can clarify later)Anyway, out of the 4 tests I've run, I'm not really understanding the performance difference when I look at the estimated execution plans. If anyone has any pointers as to which of the 4 test queries looks optimal, I'd appreciate it (and "why" if at all possible) I'm thinking the one with the CTE is probably the best performer, but that's also the least practical for the environment (again, will need to clarify that later)Here's some sample data using a table variable:Declare @CDate datetimeset @CDate=GETDATE() --Run on 11-27-10Declare @BadDateFormatTable Table (DateInt Int)insert into @BadDateFormatTable (DateInt)Values (20101119);insert into @BadDateFormatTable (DateInt)Values (20101120);insert into @BadDateFormatTable (DateInt)Values (20101121);insert into @BadDateFormatTable (DateInt)Values (20101122);insert into @BadDateFormatTable (DateInt)Values (20101123);insert into @BadDateFormatTable (DateInt)Values (20101124);insert into @BadDateFormatTable (DateInt)Values (20101125);insert into @BadDateFormatTable (DateInt)Values (20101126);insert into @BadDateFormatTable (DateInt)Values (20101127);--Test1 : Use results of subquery in order to get the product of datediff functionSelect A.BadDateFrom ( Select CONVERT(char(8), dateint,112) as BadDate, DATEDIFF(dd, convert(char(8), dateint,112), @cdate) as DDiff From @BadDateFormatTable) AWhere A.DDiff between 1 and 3;--Test2 : Use the DateDiff function in derived table and join using the datediff to return the desired daysSelect A.DateIntFrom @BadDateFormatTable AJoin (Select Dateint, DATEDIFF(dd, convert(char(8), dateint,112),@cdate) as DDiff From @BadDateFormatTable) B On A.DateInt=B.DateInt and B.DDiff between 1 and 3; --Test3 : Use DateDiff Function in derived table and filter using the calculated ddiff column in where clauseSelect A.DateIntFrom @BadDateFormatTable AJoin (Select Dateint, DATEDIFF(dd, convert(char(8), dateint,112),@cdate) as DDiff From @BadDateFormatTable) B on A.DateInt=B.DateIntWhere B.DDiff between 1 and 3;--Test4 : Use CTE to produce datediffWith DateCte (DateInt, DDiff)As (Select Dateint, DATEDIFF(dd, convert(char(8), dateint,112),@cdate) as DDiff From @BadDateFormatTable)Select DateIntFrom DateCteWhere DDiff between 1 and 3;--Numbers 1 and 3 seem to be just about the same query after reviewing(edit: Punctuation) |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2010-11-27 : 04:12:51
|
| did you try looking into execution times and plan for each?------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-27 : 04:27:39
|
| Hi Vis,I'm not really understanding what the execution plans are saying; I've read a few columns on execution plans, but it's hard for me to understand as I don't know what "good" looks like.This is what I got when I set the statistics time on:Test 1 cpu 0ms elapsed 147msTest 2 cpu 16ms elapsed 166msTest 3 cpu 31ms elapsed 162msTest 4 cpu 0ms elapsed 148msedit: because the original query has so little data and there were literally no times responding from statistics on, I compared to the actual table and set the "between" searches for "Between 1 and 5000" rather than 1 and 3. That's where the stats times came from. Sorry if confusing. |
 |
|
|
Transact Charlie
Master Smack Fu Yak Hacker
3451 Posts |
Posted - 2010-11-27 : 06:07:04
|
| Your test isn't a good model. The main *reason* that you don't want to use functions over columns in the WHERE clause is that it stops you using an index on that column.My advice:1) build a test table (a permanent one) with a large selection of data.2) Put indexes on that table that mirror your production setup3) start testing.statistics time on is not what you should be looking at first. The metrics you want to look at are:1) Execution plan - "show actual execution plan" in the query tooltip. you want to look for "scans" in the execution plan on columns that you have an index on that you want to use. If you were using the index you would see "seeks"2) Statistics IO - with a lot of test data you should get some meaningful data on reads / writes.Charlie===============================================================Msg 3903, Level 16, State 1, Line 1736The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-27 : 07:13:26
|
| Do you have to use DATEDIFF?Given that your dates are in yyyymmdd and thus sort in chronological order, can you just doWHERE DateInt >= @MyStartDateInt AND DateInt < @MyStopDateIntFailing that I think you need a "cheaper" conversion to DATETIME datatype than going via CHAR. And all arithmetic formula will be quicker |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-27 : 12:42:26
|
| Kirsten and Charlie, thanks for the feedback. I'll try to answer each of you guys separately:Charlie: I did run this against the "actual" permanent table; there are approximately 7k records which is where the resulting testing times came from. I figured it was easier to show using a short example with a table variable. As for the execution plan, yes, I realize that's a great tool, I just don't get it. I am having a hard time interpreting the results.Kirsten: In the example I gave, yes, I do need to use datediff because it's the foundation for a report in another interface where I can't use user defined parameters. It runs automatically against getdate() each week. But the larger question is that I'm trying to avoid functions in where clause in a large amount of queries. This was just one example. I guess I was hoping to find a better approach (ie, wrapping it in a CTE as a "select" and then putting the resulting computed column into the where clause etc) I have queries that use many other functions in the Where clause and I just wanted to see if putting it into a different place in the query would have any positive performance effect. Thanks for your help...any further feedback is very greatly appreciated! |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-27 : 15:05:05
|
| "It runs automatically against getdate() each week"Use GetDate to calculate a Min / Max YYYYMMDD integer value and use those as the Range for your DateInt column?If not clear ask me to clarify pls |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-27 : 15:52:18
|
| Hi Kristen,If you don't mind clarifying, that would be great. What would the Min/Max be used against? Do you have an example? |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-27 : 16:11:29
|
"I do need to use datediff because it's the foundation for a report in another interface where I can't use user defined parameters. It runs automatically against getdate() each week."So ... you want, say, the previous 7 days and perhaps you are using DATEDIFF to calculate if DateInt is within 7 days of GetDate()?WHERE DATEDIFF(Day, DateInt, GetDate()) < 7something like that?Instead you could doDECLARE @dtStart int, @dtStop intSELECT @dtStart = CONVERT(int, CONVERT(char(8), DATEADD(Day, -7, GetDate()), 112)), @dtStop = CONVERT(int, CONVERT(char(8), GetDate(), 112))SELECT Col1, Col2, ...FROM MyTableWHERE DateInt >= @dtStart AND DateInt <= @dtStop |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-27 : 16:38:03
|
| Oh, very cool! Do you see any harm in running a bit differently? I can't declare variable in the reports I need to publish, so I need to work around using derived tables (can use CTE's in one of the environments, but not in all, so derived tables are my fall back on a lot of this) :select A.dateintFrom @BadDateFormatTable AJoin ( SELECT StartDt = CONVERT(int, CONVERT(char(8), DATEADD(Day, -7, GetDate()), 112)), EndDt = CONVERT(int, CONVERT(char(8), GetDate(), 112)) ) b on A.DateInt between b.StartDt and b.EndDtAlso, as I've noted, I have a ton of other types of functions that I need to try to pull out of the where clause. Do you think this approach is generally going to work to optimize?For example, if I have to get the datepart for a query:Select SomethingFrom mytableWhere datepart(dw, IntDateCol)=7to return data only for Saturdays. I'm thinking after looking at your example that I could do something like in a simplified format:Declare @date intDeclare @dw intSelect @date = 20101127select @dw=convert(int, datepart(dw, convert(char(8), @date,112)))select @datewhere @dw=7(again, I'll need to convert this to a derived table so I can use it) |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2010-11-28 : 02:36:16
|
| which reporting tool you're using? why cant you use variables? why not then wrap everything into a procedure and call it from report?------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-28 : 02:50:40
|
| Hi Vis,The software we use has a user interface similar to Query Analyzer, but it's limited in its functions. All statements must start with SELECT.I could create a procedure but I'll be losing access to SQL Server for this application soon and thus will not be able to do anything more with it. I'm trying to confine all my query writing in this environment to things I"ll be able to work with as our MIS department is taking access away but they will not be providing any support either.Bottom line is that everything has to be wrapped up in a select statement of some sort. I'm lucky that the sister product which is very similar at least allows Common Table Expressions; however, this one does not. |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-28 : 03:34:27
|
" I need to work around using derived tables"I doubt you even need a derived table, this should be fine:WHERE DateInt >= CONVERT(int, CONVERT(char(8), DATEADD(Day, -7, GetDate()), 112))I am pretty sure that SQL will only evaluate the right-hand-side once and then use it as a CONSTANT in the query plan."The software we use has a user interface similar to Query Analyzer, but it's limited in its functions. All statements must start with SELECT"How do purchasing decisions for such things ever get passed? The wining-and-dining route I expect |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-28 : 03:46:00
|
| Hey Kristen,couple things...that sounds great...I didn't realize that the function only eats at performance if it's on the "left side" of the operator. So that's good to know.As for the purchasing decision, believe it or not, the software we use is actually very good for what it does. I have a love/hate relationship with the company (and we purchased it long before I was in my current position). The issue really is that the company is enormous, has swallowed up tons of smaller companies and has really had issues with keeping code consistent (tons and tons of legacy columns that cause problems etc) Anyway, they were running into security issues years ago with users mucking up their databases with the query writer, so they limited it to select statements. Technically, the on-site technical admin person can use SQL server with all its bells and whistles. Unfortunately, that's more of an internal thing, the municipality I work for is basically shutting anything they can down without providing the help we need to do our jobs. So I have to do lots and lots of workarounds :)Thanks for your help! |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-28 : 14:00:43
|
| "I didn't realize that the function only eats at performance if it's on the "left side" of the operator"You have probably got it, but its not "left side", it is whether the function acts on a column in the tables - i.e. something that varies from row to row. Anything that is a Constant, for the duration of the query, should be only calculated once (unless you manage to fool the Query Planner into not realising that it won't change!)If you do WHERE MyColumn > 'xxx'then SQL can any available Index on MyColumnLikewise if you doWHERE MyColumn > 'xxx' + 'yyy'orWHERE 'xxx' + 'yyy' < MyColumnbut if you doWHERE SomeFunction(MyColumn) > 'xxx'then indexes are off the menu! (SQL may be smart enough to use an index in some cases, I don't know, but I avoid this scenario) |
 |
|
|
X002548
Not Just a Number
15586 Posts |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-28 : 15:57:08
|
| Ok Kristen, I think I got it. So, going back to your earlier example:WHERE DateInt >= CONVERT(int, CONVERT(char(8), DATEADD(Day, -7, GetDate()), 112)), if I were to change getdate to a particular column in the table such as:WHERE DateInt >= CONVERT(int, CONVERT(char(8), DATEADD(Day, -7, SomeDateField), 112))Then it would likely have the same effect of negating SQL Server's ability to run an index against that column? (the somedatefield)Also, it's looking like I probably have to figure out a different way to run searches using functions. For example:Where Datepart(dw, mydatefield)=2My question is, if I take the function out of the where clause and put it somewhere else like in the select statement of a derived table, then use the end result for the where search, is that getting around the where clause limitation?EX:Select Datepart(dw, t1.datefield), Col2From Table1 T1JOIN (Select Datepart(dw, DateField) DF2 From Table2) T2 on T2.DF2=Datepart(dw, t1.datefield)Is putting the function into the ON clause the same as putting a function into the WHere clause? LIkewise:Select Datepart(dw, t1.datefield), Col2From Table1 T1CROSS JOIN (Select Datepart(dw, DateField) DF2 From Table2) T2 WHERE T2.DF2=7Is the fact that I'm using the product of the datepart function in the derived table in where clause of the whole query the same effect? |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-29 : 02:15:56
|
| "non sargable"That's the word, thanks Brett!"Where Datepart(dw, mydatefield)=2"What period of time is this over? If it is 4 weeks, say, you could just test against 4 specific date valuesWhere mydatefield IN (20101129, 20101122, 20101115, , 20101108)(each of which you could calculate from GetDate() and should be treated as Constant values in the query)"Is putting the function into the ON clause the same as putting a function into the WHere clause?"Yes, SQL won't use an INDEX for the JOIN in that circumstanceBut you are on the right track, if you can get the actual dates for the day-of-week into a derived table (or just an "IN" as I mentioned above) that should help. |
 |
|
|
flamblaster
Constraint Violating Yak Guru
384 Posts |
Posted - 2010-11-29 : 02:47:38
|
| Kristen, thanks for sticking with me...this is helping a lot! Ok, so the date range would probably be impractical for the majority of the things I'm trying to run to use an "IN" clause. I have to run reports to see "every Monday (or other day of week)" over a 2 - 3 year period sometimes to look for trends etc.So probably something like I mentioned earlier such as:Select Datepart(dw, t1.datefield), Col2From Table1 T1CROSS JOIN (Select Datepart(dw, DateField) DF2From Table2) T2 WHERE T2.DF2=7would be the best route for the environment I work in. I've also seem some things on using functions in cross apply in order to get around the performance issues, or am I mucking that up? |
 |
|
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2010-11-29 : 04:22:18
|
quote: Originally posted by flamblaster The program that runs on the database I use has all the dates in int datatype (yyyymmdd) and I can't change it.
1. Fire the designer who decided for this all-in-the-world-reasons approach2. Add a calculated column with this piece of codeALTER TABLE BadDateFormatTableADD MyDate AS (CAST(CAST(DateInt AS CHAR(8)) AS DATETIME)) 3. Now, with a correct datatype, you can do whatever calculations you want.For more information about DATEDIFF, see here http://www.sqlteam.com/article/datediff-function-demystified N 56°04'39.26"E 12°55'05.63" |
 |
|
|
Kristen
Test
22859 Posts |
Posted - 2010-11-29 : 07:04:09
|
Is that calculated column maintained as data in the row then? I thought it was just calculated at "use"??But, yeah, there is no excuse for storing a date in a non-date column, that's clear to see from the problems it is causing for flamblaster - let alone all the others I can think of.I wonder if there is any invalid yyyymmdd values in that INT column? " I have to run reports to see "every Monday (or other day of week)" over a 2 - 3 year period sometimes"You could use a Tally Table to generate those in the derived table. I haven't got time at the mo to explain that, hopefully Google or someone else can. Basically you need a sequence of numbers 0, 7, 14, ... which you add to your report starting date to get the day-of-week values, and then you can JOIN that to your yyyymmdd INT columnChanging the crappy database design would be better though ... |
 |
|
|
Next Page
|
|
|
|
|