Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Help with name field search.

Author  Topic 

theGrind
Starting Member

5 Posts

Posted - 2009-10-29 : 01:54:46
Hello,

I have been thrown into this and I am having a problem fixing a query that takes 10-20 or more seconds to complete.

The table has 20 fields most of which are only numbers and one that is a name field...LastName FirstName. << This is the one causing our problem.

The table is 1.5 Million rows with indexes on three numeric fields ...these work fine and query less than a second.


This table is updated nightly...I tried a full text index on the Name column but it took over 4 hours to complete. Not sure if we should do that each night.

Right now we are using SQL like this...

WHERE fullName LIKE '%foster%' AND fullName LIKE '%brian%'

to do a single query for "Foster Brian".....basically it takes 15-30 seconds and is unacceptable.

Our box is:
Win 2003 Server
MS SQL 2000
4g RAM
2.4 Dual Processor

I guess my question is what Indexes or T-SQL would you guys recommend for something like this. We could Index the name field but what sort of SQL would we use to find "Foster Brian" for example. I really mean example... if you can type it out for me that would be greatly appreciated.

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2009-10-29 : 14:25:58
WHERE fullName LIKE '%foster%' AND fullName LIKE '%brian%'

is always going to cause a table scan no matter what you do with the indexes. how is the data stored? Like this? "Foster, Brian" why can't you just search for the name you're looking for?

If you take the leading wild-card out you may get an index seek

WHERE fullname like 'Foster%Brian%'
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2009-10-29 : 15:12:50
I would suggest you split the name into two columns. If the data is fairly static (names rarely change), then this would not be expensive to implement.

________________________________________________
If it is not practically useful, then it is practically useless.
________________________________________________
Go to Top of Page

theGrind
Starting Member

5 Posts

Posted - 2009-10-29 : 16:50:31
russell!!!!!!!

great solution. Queries dropped from 10-20 seconds to 1-2...

Many many thanks.

No leading Wildcard = index seek. Will remember that one for sure.

Thanks again very much.
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2009-10-29 : 21:49:53
Glad to help
Go to Top of Page

theGrind
Starting Member

5 Posts

Posted - 2009-10-30 : 15:47:32
russell!! Don't leave yet! :D

One more question for ya...

There is also an address field that I needed to do the same thing to. Example address: 12 falcon rd

The field is Indexed.

I was in Profiler watching people query the db and saw the query
WHERE addr LIKE '12%falcon%' took 48 seconds and scanned through 1.5 million rows. So I went and found that if the first two characters are numbers it will not do an index seek. If I put like 1h instead of 12 it will index seek. or if the number is 3 characters long it will index seek. Is there anyway to get around this? We have 100K plus two digit addresses.

Thanks.
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2009-10-30 : 22:17:44
the way around is to make the query more selective. only the characters to the left of the wild card will be considered for an index seek. a wild card matches everything so this makes sense. now, depending on the distribution or selectivity of the data, it is far more likely to perform a scan when it uses fewer characters.

consider your 1.5 million record table. there aren't many (100 to be exact) possible combinations that start with 2 numeric digits. this isn't likely to be selective enough to cause a seek, so the optimizer chooses a scan.

in your example, do you really want all of these returned?
12 Falcon Rd
123 Falcon Drive
1231 Falcon Street
1232 Falcon Street
1268 Falcon Street
etc.

Better to search for a closer match:
'12 Falcon%'
or an exact match
'12 Falcon Rd'

Also, on address searches, I'll often require a city or zip code. This will drastically reduce the number of records that need to be scanned:

WHERE zip = '94523'
And Address Like '12 Falcon%'

With long running queries, always look at the execution plan and see what the expensive parts are. Be on the lookout for index scans and table scans. Set Statistics_IO ON and look at the output in the messages tab too. See what table(s) have a significant # of reads and think of strategies to reduce the reads.

I've often said that (for the most part) the practice of query tuning is really the practice of reducing reads.
Go to Top of Page

theGrind
Starting Member

5 Posts

Posted - 2009-11-04 : 21:43:32
russell,

I see. I will just do some more if then stuff when the digits are small.

man...I have been updating over 15 heavy traffic (for us) sites with your help in this topic. Processor has dropped from average 85-100% down to 2% MAYBE 20% on hard load. INCREDIBLE!!

Now I am looking at queries all the time to try and find a bad one. :D

I do have another question of course. :D

Say you have a person Last name Smith first Name Robert

in the data its SMITH ROBERT D

I can get an Index Scan by saying "AND OWNAM1 like '% robert%'"
But do you know any way to get an Index Seek looking for just the first name and nothing else?

Thanks again and again!!!
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2009-11-05 : 08:18:28
Might be time to look at seperating firstname and lastname into their own fields.

That's some serious improvement you got on CPU utilization! They oughtta give you a raise High CPU is almost always big table or index scans.

It's fun finding bad queries and turning 'em into good ones. Keep posting back when you have more questions!
Go to Top of Page

theGrind
Starting Member

5 Posts

Posted - 2009-11-05 : 11:10:03
russell,

Yea one of the problems with splitting it is some are business names and some names have spaces in them....yuck.

But yea HUGE difference in performance wow. It may allow us to put more sites on our first server and have no need for the other. Raise here I come...

Will post here again with questions with your name in the thread title.

Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2009-11-05 : 14:44:57
If you really MUST query by firstname in this sort of condition, may want to create an indexed view of the table(s), adding fisrtname and lastname columns to the view and indexing them.

careful with this though...you'll have to handle values that don't have spaces in them.

something like this:

create view v_ttt
with schemabinding
as

select id, left(fullname, charindex(' ', fullname)-1) as fname,
right(fullname, len(fullname) - charindex(' ', fullname)) as lname
from dbo.t1;
GO

create unique clustered index ci__vttt_userid on v_ttt(id);
go
create index ix_vttt_fname on v_ttt (fname);
go
create index ix_vttt_lname on v_ttt (lname);
go
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2009-11-05 : 15:28:37
My name parsing function. Very reliable for all Western-style names, but of course it may give odd results with Business names.
http://sqlblindman.pastebin.com/f68f37c15

________________________________________________
If it is not practically useful, then it is practically useless.
________________________________________________
Go to Top of Page
   

- Advertisement -