Search Your Database with the CONTAINS SQL Function

normalization in sqlThe right SQL statement makes databases the perfect tool for any task. All you have to do is populate your database tables with the data you need, and use your SQL queries to combine them into useful pieces of information. The power of SQL lies in the WHERE clause, which lets you input filters and other search criteria to create the perfect solution to your problem. However, there will be times when you only know a portion of the information you need. Regular SQL logical and equality statements only handle whole pieces of data. If you need to test your database against substrings, you need use the CONTAINS SQL function.

Read about other SQL data manipulation functions at Udemy

The CONTAINS SQL Function for SQL Server

Before we begin, you have to understand that the CONTAINS SQL function is not a standard SQL function. While every database system has the function, they all deal with it differently. The function also requires different arguments for different databases as well. Therefore, before you use it, you must know and understand the database system you use.

Get to know SQL Server at Udemy

With any database, the CONTAINS SQL function for SQL Server checks if one string contains a second string as a substring. For Microsoft SQL Server and similar systems, CONTAINS lets you do full-text term pattern-matching queries on your tables.

By far the most common use of the keyword, CONTAINS requires two arguments and returns a number a result presenting true or false. It will be 1 (true) if it finds a match and 0 (false) if it doesn’t. The first argument is the name of the table column you want searched. The second argument is your substring you need. The basic syntax is as follows.

SELECT columnName FROM yourTable WHERE CONTAINS (columnName, ‘yourSubstring’);

From my example above, you should note that I did not enclose the column name in quotes, but I did enclose the arguments for the CONTAINS SQL function in parentheses.

You can use the CONTAINS function in any SQL query that you can use the WHERE clause, though it is the most useful with SELECT statements. With SELECT, CONTAINS returns all records where the inputted column contains the substring. For example, the following sample only returns records from New York City that contain the top ten phrases containing words with the letter “a” in them.

SELECT DISTINCT TOP 10 phrase AS Result FROM phrases_table WHERE CITY=’New York City’ AND CONTAINS (phrase, ‘a*’);

Result:

PhraseIDResult
23345A Group Creation
67872Silver Hut A House of Jewellery
36274DOEACC A Level Course
58444A V Jewellers
11234A One Tent House
37098The Azad A Complete Wedding Planner
5990A One
60636A One Beauty Parlour
74257A One Band
76122D A International Travels

 

The * in the substring argument is a wildcard placeholder that represents all the other characters in a word. I did it that way because SQL Server wouldn’t return anything. SQL Server treats the substring as a complete word. If I just used ‘a’, SQL would ignore it as “a” is one of the noise words SQL Server ignores, along with “the” and “an”. Often called particles, noise words are defined by the specified language. Because they show up so frequently, SQL Server ignores them all along with punctuation marks. You can check which words SQL Server considers noise by looking at the noise file in your SQL Server’s installation directory.

In the last example, I only searched for a single substring. However, you can include as many substrings in the argument as you need. You have to enclose each additional substring in double quotes and separated by a logical operator. For example, the following query brings backs all documents that have either “replacing” or “pedals” it their summary.

SELECT DocID, DocSummary FROM production.dosuments WHERE CONTAINS (DocSummary, ‘”replacing” OR “pedals”’);

Like as before, SQL Server searches for these words as substrings in their entirety. It does not look for synonyms or inflections. Because I didn’t include wildcards, it won’t search for these terms inside other words wither.  I used the operator OR in the above example to return results containing either word, but you could switch it out for AND if you need to have both words included.

CONTAINS SQL Function with other Databases

That is all you need to know to use COTAINS in your SQL statements with the SQL Server system. As I mentioned before, other databases may use the keyword as well, but they all use it differently. In these other systems, you need to use the LIKE statement to achieve the same results as above.

See how to use the LIKE statement in SQL at Udemy

Oracle Database

With the Oracle Database System, CONTAINS works much like the SQL Server version, but with two additional arguments. The basic syntax looks like this:

CONTAINS (column_name, substring, label, policy_hint)

Column_name and substring are the same as they are with SQL Server.

Label must be a number and it represents the score for the CONTAINS function. It is optional except when you have to use CONTAINS more than once in a query. If you use the SCORE function, it must refer to this value.

The string policy_hint identifies the policy in affect for columns that have multiple policies. You must include a label to use it, but at least you only have to specify the desired policy name. It defaults to the first available policy if you leave it blank.

MySQL

MySQL only recognizes the CONTAINS SQL function when dealing with spatial data. It requires two graphics objects as arguments, and returns a 1 or 0 depending on if the first object completely contains the second. Designed as an implementation of the OpenGIS framework, the MySQL CONTAINS function does not work on ordinary strings, and will produce an error if you try it. MySQL only recognizes the LIKE and STRCMP functions when dealing with string of information.

Other Databases

I only included Oracle and MySQL as they are the most popular alternatives to SQL Server. As you can see, all three databases use the CONTAINS SQL function differently, and the same will be true for any other database system you may come across including Firebird. I recommend that you check to see if your system understands the CONTAINS function before you try using it in your queries. If you can use CONTAINS, you should use it as often as you can. It’s much faster than LIKE, and self-validating. It’s the perfect solution when having to search through thousands of records for particular words or phrases.