Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2001-01-14 : 22:33:44
|
Paulo writes "Hi there... I'm having a hard time with Full Text Retrieval... I've been asking this question to other guys but unfortunatly they all came out empty with ideas... How can I search a .doc file content (for ex) using sql 2000 query analyser? I've also been having some problems with sql oledb provider(MSIDXS)... I'd really apreciat if you could give me an answer to these questions... also because other guys couldn't...
Thanks for your time,
Paulo José Rocha" |
|
|
RobWafle
Starting Member
38 Posts |
Posted - 2002-03-22 : 17:07:10
|
| Graz is right, this is out of the scope of SQL server, but I don't know of any web sites dedicated to "index server" so this is our best place to discuss this problem. I think I can help. I'm working on a similar problem. I've got everything all working using the Microsoft Index engine.What I've done is used the idexing engine to index the files. I created a linked server so that I can query the Index Server from SQL server. I did this so I didn't have to put my documents in the database.. just the filesystem. This page at microsoft kinda explains adding the Index Server as a Linked Server to SQL in Example G:http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sp_adda_8gqa.aspIn my case I did this:EXEC sp_addlinkedserver FileSystem, 'fullTextIndex', 'MSIDXS', 'fullTextIndex'GO(I named the linked server and the catalog the same thing.. "fullTextIndex")I can now query the index using this statement.. which is a little wierd because you see the escaped 's or '' in the query.. and it uses a "derived query" to join data back into the sql server..SELECT * FROM OPENQUERY(FileSystem, 'SELECT Directory, FileName, DocAuthor, Size, Create, Write FROM SCOPE('' "c:\My Documents" '') WHERE CONTAINS(''Distributed'') > 0 AND FileName LIKE ''%.doc%'' ') WHERE DATEPART(yy, Write) = 1998GOFor me, this was exactly what I was looking for, as I have 311GB+ (aprox 150GB/year) of documents that didn't need to be copied into my SQL server.The problem I am facing is that the querys take MUCH longer when executed via SQL over the linked server but execute immediately when executed via the mmc console that lets me query the database.I also don't have the Platform SDK and I need to look at advquery.asp... I'm wondering if anyone has that.. I know EVERYONE DOES.. because when i search the web I find it.. but guess what Its ASP so I can only execute it and not get the source!! (I'm ordering it now.. but hey can't hurt to ask) |
 |
|
|
AjarnMark
SQL Slashing Gunting Master
3246 Posts |
Posted - 2002-03-22 : 18:38:18
|
An example much like Rob's above can be found in SQL 2000 BOL under full-text queries / file data.Also, if you look at the heading Full-Text Catalogs and Indexes you will find this note:quote: Formatted text strings, such as Microsoft® Word™ document files or HTML files, cannot be stored in character string or Unicode columns because many of the bytes in these files contain data structures that do not form valid characters. Database applications may still have a need to access this data and apply full-text searches to it. Many sites store this type of data in image columns, because image columns do not require that each byte form a valid character. SQL Server 2000 introduces the ability to perform full-text searches against these types of data stored in image columns. SQL Server 2000 supplies filters that allow it to extract the textual data from Microsoft Office™ files (.doc, .xls, and .ppt files), text files (.txt files), and HTML files (.htm files). When you design the table, in addition to the image column that holds the data, you include a binding column to hold the file extension for the format of data stored in the image column. You can create a full-text index that references both the image column and the binding column to enable full-text searches on the textual information stored in the image column. The SQL Server 2000 full-text search engine uses the file extension information from the binding column to select the proper filter to extract the textual data from the column.
------------------------GENERAL-ly speaking... |
 |
|
|
|
|
|
|
|