70-451 – Designing a Database Solution – Full Text Search

Requirements

Design a Full-Text Search strategy.
This objective may include but is not limited to: CONTAINS, CONTAINSTABLE, FREETEXT, FREETEXTTABLE

My Take

Perhaps I’m wrong, but this requirement sure seems to me that we are only concerned about the Querying of the full text, and not the Overview, Architecture, or Administration. Even with that said, I still think it would be a good idea to take the time to read through the rest of the literature on Full Text Search.

One of the first places to start for all of the listed requirements is the Full-Text Predicates and Functions Overview.

There is a white paper available on the subject of Full Text Searching: SQL Server 2008 Full-Text Search: Internals and Enhancements

Supported Query Terms

I didn’t want to repeat myself for each of the full-text predicates or functions, so the following is a list of the supported query terms:

  • Weighted Values – Only used in the CONTAINSTABLE, this allows you to set a weight value for certain words in the phrase list that would help to rank the results.
  • Proximity – only used in the CONTAINSTABLE, this allows you to filter the records based on whether a word or phrase is next to another word or phrase.
  • Prefix – Used in CONTAINS and CONTAINSTABLE, prefix searches can be performed by wrapping the term in double quotes and using an asterick (*). This will return the exact match and fuzzy matches for all words in the phrase prior to the wild card (but will not allow additional words to be placed in the middle of the phrase).
  • Boolean Logic – CONTAINS and CONTAINSTABLE use the boolean words AND, OR, and NOT to perform boolean logic between the words or phrases.
  • Inflectional Terms – this is the default for FREETEXT and FREETEXTTABLE, but can also be used in CONTAINS and CONTAINSTABLE with the optional inflectional argument. Inflectional terms search for different tenses of a verb, or the singular and plural form of a noun.
  • Synonyms – Default for FREETEXT and FREETEXTTABLE, optional for CONTAINS and CONTAINSTABLE with the THESAURUS keyword. Synonym searching allows the match of user specified synonyms for terms. For instance, Red, Blue and Green might be synonyms for Color. When you search for color, you will receive records that contain color or any of the synonyms for color. The thesaurus will need to be properly configured to get the best use from this operation.

NOTE: you can specify the language you would like to use for the search for each of the full text search types. Each language will have a different set of rules for word breaking and stemming, different stop words and will use a different thesaurus.

CONTAINS

CONTAINS is a filter used in the where clause to return only those records who match the search criteria on the column (or columns). CONTAINS is generally used for exact, or fuzzy, matching of the given phrase. Contains can be used to search for a word or phrase, a prefix of the word or phrase, a word in proximity of another, a word which is an inflectional term of another and a word that is a synonym of another.

SELECT Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, ‘ FORMSOF (INFLECTIONAL, ride) ‘);

CONTAINSTABLE

Similar to CONTAINS, but this is used in the FROM clause to return a record set of zero one or more rows. The CONTAINSTABLE can also return the relevance ranking value (RANK) and full-text key (KEY) for each record returned. Furthermore, you can limit the number of records returned, by rank, with the top_n_by_rank clause.

SELECT FT_TBL.Description, FT_TBL.CategoryName , KEY_TBL.RANK
FROM Categories AS FT_TBL
INNER JOIN CONTAINSTABLE (Categories, Description,
‘("sweet and savory" NEAR sauces) OR
("sweet and savory" NEAR candies)’, 10)
AS KEY_TBL
ON FT_TBL.CategoryID = KEY_TBL.[KEY]

FREETEXT

FREETEXT, like CONTAINS, is used in the where clause as a filter to return only records which match the given criteria for a given column (or columns). Unlike CONTAINS, FREETEXT is generally used to match the values for the meaning, and not the exact word, of the word or phrase in the search criteria. FREETEXT internally resolves a series of steps to identify the weight or rank of the results by performing word breaking to separate the string into individual words on the word boundaries, generating the inflectional forms of each word and identifying a list of expansions or replacements based on the thesaurus.

SELECT Title
FROM Production.Document
WHERE FREETEXT (Document, ‘vital safety components’ );

FREETEXTTABLE

Similar to FREETEXT, but this is used in the FROM clause to return a record set of zero, one or more rows. FREETEXTTABLE, like CONTAINSTABLE will return the ranking value and full-text key for each record returned. You can still limit the number of records returned with the top_n_by_rank clause.

USE Northwind;
SELECT FT_TBL.CategoryName
,FT_TBL.Description
,KEY_TBL.RANK
FROM dbo.Categories AS FT_TBL
INNER JOIN FREETEXTTABLE(dbo.Categories, Description,
‘sweetest candy bread and dry meat’) AS KEY_TBL
ON FT_TBL.CategoryID = KEY_TBL.[KEY];
GO

Advertisements
This entry was posted in SQL and tagged , , . Bookmark the permalink.

3 Responses to 70-451 – Designing a Database Solution – Full Text Search

  1. Robert says:

    Test 70-451 is a developer test. That\’s why it is concerned with the querying of the full-text engine and not the overview, architecture, or administration of it.

  2. Pingback: MCITP 70-451 Links Page | Destination: Change

  3. Pingback: Designing a Database Strategy (13%) « SQLThoughts

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s