
 What resources would increase SQL performance?


ahouse
Starting Member

27 Posts

Posted - 2009-02-11 : 11:00:18
I am joining three tables and then inserting the results into a new table. Sometimes the join produces a couple hundred rows (which runs quickly), but on occasion it produces around a million rows (which doesn't run as quickly).

Inserting a million rows is sometimes unavoidable, and the result set is correct.

I am going to purchase a new server to exclusively run this SQL Server query, but I am unsure what resources SQL Server 2000 uses to perform the calculations. I.e., would 16 GB of RAM decrease the processing time? Faster processors? Faster hard drives/a RAID 10 configuration?

The query is listed below in case there is something seriously wrong with the way I am joining, but I think this is more of a processing-power issue than a query optimization problem.

BusinessMaster table contains 18 million records
ZipCodeMaster table contains 45,000 records
NewBusinessTemp table contains roughly 2000 records

I'm joining NewBusinessTemp to ZipCodeMaster on a dummy column called Filler, where the value is always 1, and then using the WHERE clause to drill down to which zip codes I want to use.
Then I join ZipCodeMaster to BusinessMaster on this zip code dataset and use the WHERE clause to compare the business names to each other.

(Essentially I am trying to find a matching business name within a 35-mile radius of the zip code provided in NewBusinessTemp.)



Select BusinessMaster.CompanyName, BusinessMaster.City, NewBusinessTemp.BusinessName, NewBusinessTemp.Account, ZipCodeMaster.State
From (NewBusinessTemp Inner Join ZipCodeMaster On NewBusinessTemp.Filler = ZipCodeMaster.Filler)
Inner Join BusinessMaster On BusinessMaster.Zip = ZipCodeMaster.ZipCode
Where (dbo.CalculateDistance(NewBusinessTemp.Latitude, NewBusinessTemp.Longitude, ZipCodeMaster.Latitude, ZipCodeMaster.Longitude, 'Miles') < 35)
And (dbo.CompareText(NewBusinessTemp.BusinessName, BusinessMaster.CompanyName) > 80)

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2009-02-11 : 12:37:44
Have you considered other options - can you move this to off hours? Or use BCP? Or SSIS?

Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/

ahouse
Starting Member

27 Posts

Posted - 2009-02-11 : 14:05:13
Right now this process is running on a server that isn't being used for anything else, so off hours is not a problem. The problem is that it's on day 2 of running and will probably go another 15 hours until it finishes, and I need the results as soon as possible. I was trying to pinpoint a possible hardware fix for this.

I'm not familiar with BCP or SSIS, but if those offer much faster results then maybe I need to think about taking that route?

dinakar
Master Smack Fu Yak Hacker

2507 Posts

Posted - 2009-02-11 : 14:57:10
Yes, bcp is definitely much faster.. I have transferred about a billion rows in 6-8 hrs.. of course this depends on the hardware too - CPU, disk I/O, etc.. but it's faster than insert into ... select from ...
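
For reference, a rough sketch of what the bcp route could look like here - the server name, destination table, and file path are placeholders, the login's default database is assumed to hold these tables and functions, and note that this only changes how the result rows are written out and loaded; the join itself still has to run:

REM export the join results to a flat file (run from a command prompt; MYSERVER and paths are placeholders)
bcp "SELECT bm.CompanyName, bm.City, nbt.BusinessName, nbt.Account, zcm.State FROM NewBusinessTemp nbt INNER JOIN ZipCodeMaster zcm ON nbt.Filler = zcm.Filler INNER JOIN BusinessMaster bm ON bm.Zip = zcm.ZipCode WHERE dbo.CalculateDistance(nbt.Latitude, nbt.Longitude, zcm.Latitude, zcm.Longitude, 'Miles') < 35 AND dbo.CompareText(nbt.BusinessName, bm.CompanyName) > 80" queryout C:\temp\matches.txt -c -S MYSERVER -T

REM bulk load the flat file into the (already existing) destination table
bcp MyDatabase.dbo.BusinessMatches in C:\temp\matches.txt -c -S MYSERVER -T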

Dinakar Nethi
************************
Life is short. Enjoy it.
************************
http://weblogs.sqlteam.com/dinakar/

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2009-02-11 : 15:45:31
It is possible/likely that the performance of your queries is being limited by the two functions, dbo.CalculateDistance and dbo.CompareText.

They result in what is basically row-by-row cursor processing, so depending on what each function does, they can have a big impact on performance.


You might consider doing this query in two steps: first join NewBusinessTemp to ZipCodeMaster to get the zip codes of interest for each row in NewBusinessTemp, insert those results into a temp table, and then join that to BusinessMaster for your final result.
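
A rough sketch of that two-step approach, using the table and column names from the posted query (the temp table layout and the index are just one way to do it):

-- Step 1: for each NewBusinessTemp row, keep only the zip codes within 35 miles
SELECT NewBusinessTemp.BusinessName,
       NewBusinessTemp.Account,
       ZipCodeMaster.ZipCode,
       ZipCodeMaster.State
INTO   #NearbyZips
FROM   NewBusinessTemp
       INNER JOIN ZipCodeMaster
       ON NewBusinessTemp.Filler = ZipCodeMaster.Filler
WHERE  dbo.CalculateDistance(NewBusinessTemp.Latitude, NewBusinessTemp.Longitude,
                             ZipCodeMaster.Latitude, ZipCodeMaster.Longitude, 'Miles') < 35

-- index the join column so the second step can seek into the temp table
CREATE INDEX IX_NearbyZips_ZipCode ON #NearbyZips (ZipCode)

-- Step 2: join the much smaller zip list to BusinessMaster and apply the name comparison
SELECT BusinessMaster.CompanyName,
       BusinessMaster.City,
       #NearbyZips.BusinessName,
       #NearbyZips.Account,
       #NearbyZips.State
FROM   #NearbyZips
       INNER JOIN BusinessMaster
       ON BusinessMaster.Zip = #NearbyZips.ZipCode
WHERE  dbo.CompareText(#NearbyZips.BusinessName, BusinessMaster.CompanyName) > 80

This way the distance function only runs against the 2000 x 45,000 zip code combinations, and the name comparison only runs against BusinessMaster rows that already match a nearby zip code.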

You should also look at the links below for ideas on how to limit the zip codes that you have to compare in your first query.

Function F_FIND_SEARCH_LIMITS
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=112369

Great Circle Distance Function - Haversine Formula
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=81360
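
The common idea behind those links is to apply a cheap latitude/longitude bounding box before calling the distance function, so the expensive call only runs for zip codes that could possibly be within 35 miles. A rough sketch of that pre-filter added to the first query above (the 0.51 figure assumes roughly 69 miles per degree of latitude, and the longitude band widens with latitude; the function still does the exact check):

WHERE  ZipCodeMaster.Latitude  BETWEEN NewBusinessTemp.Latitude  - 0.51
                               AND     NewBusinessTemp.Latitude  + 0.51
AND    ZipCodeMaster.Longitude BETWEEN NewBusinessTemp.Longitude - (35.0 / (69.0 * COS(NewBusinessTemp.Latitude * PI() / 180.0)))
                               AND     NewBusinessTemp.Longitude + (35.0 / (69.0 * COS(NewBusinessTemp.Latitude * PI() / 180.0)))
AND    dbo.CalculateDistance(NewBusinessTemp.Latitude, NewBusinessTemp.Longitude,
                             ZipCodeMaster.Latitude, ZipCodeMaster.Longitude, 'Miles') < 35

The BETWEEN predicates are cheap comparisons that SQL Server can evaluate (and potentially index on ZipCodeMaster) without calling the function for every row.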


Faster hardware may help your situation, but you will probably get a much larger improvement from optimizing your query.

CODO ERGO SUM