Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 Query Speed

Author  Topic 

worlwidewall
Starting Member

4 Posts

Posted - 2012-04-18 : 20:57:25
Hi

I have a table with about 9 million records. The first column is id INT NOT NULL auto_increment primary key, the second column is CHAR(14). The second column contains 14 character text codes. I need to search the text codes to see if there is a match (I don't care how many matches there are just as long as there is one. I need to be able to search for partial strings using wildcards (%) and full strings. I run the query and use limit 1; to return only one value. This works fine but in some cases it takes up to 8 seconds. I need this to be less than 1 second.

Any pointers?

Thanks

X002548
Not Just a Number

15586 Posts

Posted - 2012-04-18 : 22:09:14
What ever you do will scan

Are you using SQL Server?

Auto Increment sounds like Access

Never heard of 9 million rows in Access though

Go to Top of Page

worlwidewall
Starting Member

4 Posts

Posted - 2012-04-19 : 18:21:54
I am using MYSQL. I had it in Access before, it handled 9 million ok, more even but my file size was pretty close to max. In some cases it is actually faster in Access which is a drag since part of the reason I switched it to mysql was because I thought it would speed things up.
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2012-04-20 : 03:21:00
Just so you know this is a MS SQL Server forum so all the advice given here might not apply.

But how do you search this string? Are you searching for any combination of these 14 characters or is it possible to separate this column in to two or more columns maybe (denormalization)? Just an example: if your data looks like this "KJU20120420-HJ" then you could maybe put the KJU in one column, 20120420 in another and HJ in a third column. If this is possible you can put indexes on the new columns and searching would be a lot faster.

- Lumbago
My blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/
Go to Top of Page

worlwidewall
Starting Member

4 Posts

Posted - 2012-04-20 : 21:49:16
Hi Lumbago

Thanks for the pointer. The data can't be split like that. There is no set pattern.

Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-04-21 : 04:20:15
If you have the space enough, you can split all permutations.

A string of length n can be split in n * (n + 1) / 2 rows

Peter = 5 * 6 / 2 = 15

P
Pe
Pet
Pete
Peter
e
et
ete
eter
t
te
ter
e
er
r


If store all possible [distinct] permutations in an auxiliary table, your search will ge super fast!



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-04-21 : 04:24:20
If all data is 14 characters, the auxiliary table will store 14 * 15 / 2 = 105 rows [maximum] for each row in the original table.
For an original table of 9 million rows, that would lead to an auxiliary table of 1 billion rows, and that is not practical.

What you can do instead, is to store all possible distinct permutations in an auxiliary table, and have a comma-concatenated string holding the original unique key value. It is the search that is slow and now will be superfast. It's then trivial to unseparate all id and report which rows that had a match.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2012-04-23 : 03:11:04
Do you have the option of installing SQL Server Express instead? MySQL is very solid for standard querying (joins, indexed lookups, etc) but I don't know if the same applies for wildcard queries like yours. It would be worth a shot I think...the query engine in sql server is far more advanced than the one in MySQL.

- Lumbago
My blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/
Go to Top of Page
   

- Advertisement -