Author |
Topic |
Kristen
Test
22859 Posts |
Posted - 2007-10-06 : 05:52:58
|
If a query includes some "table stuff" and some "free text" stuff is the "free text stuff" constrained to the rows that are matched by the "table stuff"?I've been doing some optimisation work for a client and they had something likeSELECT ...FROM MyTableWHERE Active = 0 AND CONTAINS(...) I'm confident that the core table query is efficient (I know it doesn't look it in this snippet).We moved 80% of the data (roughly from 1,000,000 rows to 200,000) to an "archive" table and the query speed has gone up dramatically, and "busy time" CPU has dropped from 80% to 30%I'm inclined to think that the CONTAINS was looking at everything in the Free Text Catalogue, rather than just the rows from the table-stuff query.Any thoughts?Kristen |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-10-06 : 12:32:05
|
Yes, full text index query uses full text catalogs. That's why query result will be off if don't populate catalog properly. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-06 : 13:16:12
|
Sorry rmiao, but I haven't understood the significance of your comment.We have a full text catalogue just for this table. it is correctly populated (as far as I know).Does the CONTAINS always search the whole of the Catalogue, regardless of number of rows the rest of the (SQL) query returns?Thanks,Kristen |
 |
|
rmiao
Master Smack Fu Yak Hacker
7266 Posts |
Posted - 2007-10-06 : 19:48:38
|
Yes as I understand. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-07 : 02:33:30
|
OK, that explains the performance problems they have been having, thanks for that.Kristen |
 |
|
iminore
Posting Yak Master
141 Posts |
Posted - 2007-10-07 : 12:51:16
|
More generally a statement such as: |
 |
|
iminore
Posting Yak Master
141 Posts |
Posted - 2007-10-07 : 12:53:46
|
More generally a statement such as: select * from T where col1 = 'A' and col2 = 'B'Does this read all the col1s and all the col2s or just the col2s where col1 = 'A' ? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-07 : 13:17:29
|
In general (but you would have to check the specifics for a "real" case):If neither column is index SQL Server will do a table scan - that's like thumb'ing through every record checking if col1 = 'A' and col2 = 'B'.If one is indexed AND the index has high selectivity then that index will be used (so matching whichever column is indexed). Then the rows for that match are retrieved and the test is made for the other column.If both columns are contained in a single index (and I think one of them will need to be the FIRST key in that index), then the index will "cover" the WHERE clause, and the col1 = 'A' and col2 = 'B' will be done and then the row will be retrieved. Again, the index needs to have high enough selectivity.If the "SELECT *" was actually a list of a few columns which was ALSO "Covered" by the index then the whole query would be solved from the index without actually retrieving the record at all. Clever stuff eh?!The "Selectivity" of the index is "guessed" by using the Statistics, which provides the Query Optimisers with a Hint/Best-Guess as to whether the query will be more quickly answered from the Index, or just by thumb'ing through all the records.Make sure your Statistics are kept up to date through good housekeeping / maintenanceKristen |
 |
|
iminore
Posting Yak Master
141 Posts |
Posted - 2007-10-08 : 04:21:06
|
I was under the impression that while an index helps in sorting it doesn't help in searching. From what you say I'm wrong. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-08 : 04:46:40
|
"From what you say I'm wrong"Yup, an appropriate index will help MASSIVELY in searching Kristen |
 |
|
iminore
Posting Yak Master
141 Posts |
Posted - 2007-10-09 : 04:17:20
|
hmmThat's a big word 'Appropriate'. Go on - give me a clue. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-09 : 04:38:16
|
"'Appropriate'"Thought I'd covered it above.Index must be Selective and Relevant to the query.An index on a Date column won't help you search by the Name column An index on a Yes/No column (where the data is 50:50) is not going to be sufficiently selective, and won't be used. (Although an index on Yes/No which is 1:99 might well be used to speed up finding the 1% choice).An index for which the Statistics are not fairly regular updated, via a decent Housekeeping/Maintenance strategy, are going to send SQL Server down a wrong turn - i.e. may caused an Index to be used when it is a poor choice (e.g. lots of data deleted since Statistics updated) or fail to use an index (e.g. lots of data inserted)Kristen |
 |
|
jackv
Master Smack Fu Yak Hacker
2179 Posts |
Posted - 2007-10-09 : 05:36:11
|
Another thing to consider , is that sometimes you can return far more records than required from the Full Text Catalog. For example , if you have a search of different cars , with all the various sub categories , which might be published in different publications , you might consider with coming up different strategies in filtering the data at the Full Text Catalog , PRIOR to it returning to sql server. For example , filter on "car" "red" "Publication=myPub"(which is something you add to the record) . This may require some string manipulation , to format text for the client sideJack Vamvas--------------------Search IT jobs from multiple sources- http://www.ITjobfeed.com/SQL |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-09 : 06:43:21
|
The client is currently searching on any combination that the user provides. So if they ask for "Red Car" they do a CONTAINS for that. However, there are several columns in the Free Text Catalogue that could contain those terms, so in effect the query becomesCol1 CONTAINS("Red Car")OR Col2 CONTAINS("Red Car")CONTAINS(Col1 , '"Red*" OR "Car*"')OR CONTAINS(Col2 , '"Red*" OR "Car*"')...Maybe it is possible to have Col1 and Col2 share the same Catalogue, and then just search the Catalogue?Edit: Slightly more realistic syntax for Contains!Kristen |
 |
|
jackv
Master Smack Fu Yak Hacker
2179 Posts |
Posted - 2007-10-09 : 08:56:46
|
The following is dependant on the data model, which hasn't really been outlined. I was thinking more of adding like prefix information to a given column , which would allow the records to be further filtered such as : CONTAINS(Col1 , ('"Red*" OR "Car*" AND "Catlog=example"'),you would place the text = "Catalog=example" in the column at the data entry point .Therefore , the recordset coming back from the Full Text would be less .At the SELECT point you would do a Replace to make sure the "catlog=example" doesn't go through to the client side. This would decrease CPU pressure Jack Vamvas--------------------Search IT jobs from multiple sources- http://www.ITjobfeed.com/SQL |
 |
|
Kristen
Test
22859 Posts |
Posted - 2007-10-09 : 09:35:00
|
That's an interesting idea.The more I think about it the more I think their dynamic SQL is jacking up the size, and number of iterations , of the query.I'll ask them for some real-world examples out of SQL Profiler and see what they look like.Thanks for your help,Kristen |
 |
|
|