The SQL INSTR Function and Finding Locations in a String

sql instrThe SQL INSTR function allows you to find the starting location of a substring within a string. It is used by Oracle SQL and MySQL; many other SQL implementations have functions which are the exact or near equivalent.

It’s worth you while to get acquainted with basic SQL functions such as INSTR, since SQL programming is a skill that’s in high demand. To find out more about SQL, you might want to check out some of the online classes that help you master Master MySQL.

SQL INSTR: The Basics

INSTR searches for a substring within a string and returns its starting location in the string, using the syntax INSTR(string,substring). This means that if you tell INSTR to look for “berry” in “strawberry” it will return 6, because “berry” starts at position 6 in “strawberry”:

INSTR('strawberry','berry')

In an actual SQL query, you might use it like this:

SELECT INSTR (Fruit, 'berry')
FROM Food
WHERE Fruit = 'strawberry';

Note that if INSTR does not find the substring, it returns the number 0 (zero).

Oracle SQL vs. MySQL

The basic two-argument version of INSTR is common to both Oracle SQL and MySQL, but in Oracle SQL, you can optionally use two other arguments — the starting position, and which occurrence of the string you’re looking for, like this:

INSTR('she sells seashells','s',2,3)

This tells Oracle SQL to look for the third instance of “s” starting from position 2. It would skip the “s” in “she” (because it’s in position 1), and return 11, the position of the first “s” in “seashells”.

Note that regardless of the starting position (or, as described below, the direction of the search), the position returned by INSTR and the other string-index search functions described here is always based on the substring’s position in the string itself, starting at the beginning of the string, and not its position in relation to the start or direction of the search.

Oracle: Refining the Search

What happens if you switch the last two arguments?

INSTR('she sells seashells','s',3,2)

Now it starts at position 3 (still skipping the first “s”) and looks for the second instance of “s”, returning 9, the position of the second “s” in “shells”.

The Oracle SQL version of INSTR also allows you to count backward, by entering the position argument as a negative number:

INSTR('she sells seashells','s',-2,5)

This will tell Oracle SQL to start at the second character from the end of the string (the last “l” in “seashells”) and search forward until it finds the 5th instance of “s” — in this case, the first letter of “she”, which will return 1.

Using Position Arguments in MySQL

The MySQL version of INSTR won’t take starting position and instance arguments, but you can partly get around this limit by using the LOCATE function instead of INSTR:

LOCATE('s','she sells seashells',3)

LOCATE (which, as you can see, reverses the order of the first two arguments, like this: substring,string) will take a starting position argument, although not an instance argument. You could, however, use LOCATE to find all instances of a substring by using a loop that successively sets the starting position to the last instance +1.

The SQL Server Equivalent

Microsoft SQL Server’s Transact-SQL (or T-SQL) language does not include INSTR, but its CHARINDEX function works in basically the same way as LOCATE:

CHARINDEX('s','she sells seashells',3)

As this example shows, CHARINDEX takes the same kinds of arguments as LOCATE, and will return the same result. CHARINDEX, like LOCATE, does not take an instance argument, so a loop with incremented starting positions would be required to find all instances of the substring.

Extended INSTR Functions in Oracle SQL

Oracle SQL makes the most extensive use of INSTR. The optional position and instance arguments add significant functionality, of course, but Oracle SQL also includes some variations on INSTR based on the way that the string length is calculated. Strings and substrings themselves can be of the following types: CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB; if you use the basic INSTR function, the string length will be based on the system’s default character set.

To count string length in double byte characters, use the INSTRB function:

INSTRB('strawberry','berry')

While INSTR returns 6 as the starting position of “berry”, INSTRB will return 12, because it assumes that “strawberry” is in a double-byte character set, so it counts character positions in bytes (two per character), and returns the presumed starting byte of “berry”.

The other three Oracle SQL variants on INSTR (INSTRC, INSTR2, and INSTR4) calculate length using Unicode characters. INSTRC assumes that the string is in Unicode complete characters, while INSTR2 assumes that it is UTC-2 (the now obsolete precursor to UTF-16), and INSTR4 assumes UTC-2 (UTF-32). Since INSTR itself assumes that the target string is in the system’s default character set, you will generally not need to use these variants unless the string involved is in a character set other than the system default.

Getting Regular

Oracle SQL includes one other INSTR variant which you should know about, and which can be of considerable value, as long as you are familiar with regular expressions. The REGEXP_INSTR functions uses a regular expression rather than a substring, and includes some extended search options, allowing you to make greater use of the capabilities of regular expressions. (Note that while REGEXP_INSTR is available with 11g Release 1 and later versions of Oracle, it may not be available with earlier versions.)

The basic REGEXP_INSTR syntax looks like this:

REGEXP_INSTR(string,pattern,position,instance,return option,match parameter,subexpression)

Some of these should be familiar by now: the search string, starting position, and instance of the item that you’re searching for. The pattern, however, is a regular expression, rather than just a substring. The regular expression can be up to 512 bytes in length, and REGEXP_INSTR’s regular expressions are compliant with both the POSIX and Unicode standards. (If you don’t know what regular expressions are, there are some good online resources — including classes — for finding out about them, and about UNIX/Linux-based programming and related software development issues.)

The other parameters tell the function whether to return the position of the search character or the following character, and allow you to set the matching behavior and a subexpression of the regular expression. Needless to say, REGEXP_INSTR can give a skilled Oracle programmer very precise control over a string search.

There is, of course, much more to find out about SQL programming, about Oracle SQL, and about programming in general; the online world includes some excellent resources for learning the ins and outs of Oracle, of SQL, and of the exciting (and growing) field of database programming.