Fulltext or Fuzzy solution advice

G81
Starting Member

2 Posts

Posted - 2013-01-30 : 17:09:31

Hi guys,

I have two simple tables. Both have one nvarchar(100) column and a PK ID column with an auto-increment seed. What I want to do is return all rows of the first table, plus only those rows of the second table where the value is very similar (i.e. a spelling mistake, or the addition/absence of specific key words). I'd also like to be able to use a thesaurus if possible. I've gone at this problem two ways, and am after any advice or input please:

1) SSIS Fuzzy Lookup
It's working quite well, but it's not great for small words, and I can't seem to find a way to use a thesaurus file or to include stop words or noise words.

2) FREETEXT, CONTAINS, and FORMSOF
Great functionality, but how would it work for my example above? Is it possible to join both tables together in this way, and to return only high-scoring matches from the 2nd lookup table?

As an example I might have:

MyCompany Danmark in Table1
MyCompany Denmark in Table2

Danmark and Denmark in my thesaurus file

and therefore an exact match. Also, if LTD is included in either table for that row, it should still return as an exact match, because LTD is in some sort of stop/noise list. Any ideas on how to implement something like this? And am I on the right track?
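
To make the matching rule I'm after concrete, here is a rough sketch in Python rather than T-SQL. It is not a full-text or SSIS implementation, just the logic: drop stop words, map thesaurus variants to one spelling, then score what is left. The `THESAURUS` and `STOP_WORDS` contents, the function names, and the 0.85 threshold are all made up for illustration, and `difflib.SequenceMatcher` is only a stand-in for whatever scoring the real solution would use.

```python
from difflib import SequenceMatcher

# Hypothetical stand-ins for the thesaurus file and the stop/noise word list.
THESAURUS = {"danmark": "denmark"}
STOP_WORDS = {"ltd"}


def normalize(name):
    """Lower-case, drop stop words, and map thesaurus variants to one form."""
    words = name.lower().split()
    kept = [THESAURUS.get(w, w) for w in words if w not in STOP_WORDS]
    return " ".join(kept)


def similarity(a, b):
    """Score between 0.0 and 1.0 for the normalized forms of two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def left_join_fuzzy(table1, table2, threshold=0.85):
    """Return every Table1 value with its best Table2 match.

    The match is None when no candidate reaches the threshold, which
    mirrors a LEFT JOIN that keeps unmatched Table1 rows.
    """
    result = []
    for name1 in table1:
        best, best_score = None, 0.0
        for name2 in table2:
            score = similarity(name1, name2)
            if score > best_score:
                best, best_score = name2, score
        if best_score < threshold:
            best = None  # assumed cut-off: low scores count as no match
        result.append((name1, best))
    return result


table1 = ["MyCompany Danmark", "Other Firm"]
table2 = ["MyCompany Denmark LTD", "Unrelated Name"]
rows = left_join_fuzzy(table1, table2)
# rows[0] pairs "MyCompany Danmark" with "MyCompany Denmark LTD";
# rows[1] keeps "Other Firm" with no match (None).
```

With this rule, "MyCompany Danmark" and "MyCompany Denmark LTD" normalize to the same string and so score as an exact match, while a plain spelling mistake would score a little below 1.0 and pass or fail depending on the threshold.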
