VBA RegEx: How to Evaluate and Use “Regular Expressions”
VBA or Visual Basic for Applications is a powerful programming tool in Microsoft Excel. Regular Expression class enables VBA programs to perform regular expression searches. The latter are text patterns that can help search for strings including email addresses, phone numbers and others with specific and defined criteria. In this intermediate level tutorial, we walk you through VBA RegEx. We assume that you have basic knowledge of VBA in Excel. If not we recommend that you go through our basic course on Excel VBA before proceeding further. You may also want to quickly read through this tutorial to get a feel of all the things you can accomplish with VBA.
What is VBA Regex (Regular Expression)?
Quantifiers are needed to match several characters. Here are a few of them.
|–||Stands for a range||a-z means all the letters a to z|
|||Stands for any one of the characters quoted||[abc] means either a, b or c.[A-Z] means either A, B, …, Z|
|()||Used for grouping purposes|
||||Meaning is ‘or’||X|Y, means X or Y|
|+||Matches the character one or more times||zo+ matches ‘zoo’, but not ‘z’|
|*||Matches the character zero or more times||“lo*” matches either “l” or “loo”|
|?||Matches the character zero or once||“b?ve?” matches the “ve” in “never”.|
- Special Characters
Regex covers a lot of special characters. We look at some of the commonly used ones.
|.||Matches any single character|
|$||Matches the end of the input|
|\n||Matches the newline character|
|^||Matches the beginning of the input|
|\t||Matches the tab character|
|\d||Matches a digit in the range 0 to 9|
|\D||Matches a non-digital character|
|\w||Matches any alphanumeric character including “_”|
|\W||Matches any non-alphanumeric character|
|\s||Matches any white space characters including tab, space, newline, form feed|
|\S||Matches any non-white space character|
|\r||Matches a carriage return character; enter key|
|\||If you want to match special characters. For example “\.”, “\\”|
|\v||To find the vertical tab|
- Character Classes
Characters can be grouped by putting them inside square brackets.
|[xyz]||Match the character x,y or z.|
|[a-z]||Match any of the characters between a and z.|
|[^abc],[^0-9]||A “^” caret at the beginning denotes “not.” Here the characters other than a,b,c is matched.|
|[+*?.]||For your information most special characters have no meaning inside the square brackets. This particular expression matches any of the special characters within the square brackets.|
Are regex cases case sensitive? The answer is “yes” and “no”. It all depends on the way you write the regex code and call them in your programs.
The Limitations of Regular Expression
Certain patterns in a string cannot be matched by regular expressions. Here is an example that counts the number of occurrences of a pattern. The pattern is “anbn“. Here the characters “a” and “b” are both repeated n times. The reason being regular expressions cannot keep track of the number of occurrences.
Where is a large amount of data to process it’s wise to avoid RegEx. The reason is, if you do not use the RegEx pattern properly, it can lead to low efficiency of the program. To learn more about VBA regular expressions and how you can use them, check out this course.
Accessing VBA RegExp Object
Open your Microsoft Excel application. Click on Developer tab. From here you choose Visual Basic.From the visual Basic menu bar, choose Insert -> Module.
After the module opens a new sheet, go to Tool tab and select References. Here you get a list of items with check boxes. Search for VBscript Regular expressions. Check the corresponding checkboxes.
Regular expressions have three RegExp Methods and four RegExp Properties. Let’s take a look at each of them:
- Execute method– In this method, a match or matches of pattern from a string is extracted.
- Replace method– Here the character or string is searched, and once found it is replaced with a new character or string.
- Test- This method is used to find whether a Regex pattern is matched in a given string. The method returns either “True” or “False.” Test method is equivalent to testing whether the number of matches found is greater than 0.
Properties of RegExp Method
- Pattern– It can be a single character or a string of characters. You can go through the patterns mentioned in the above tables and use them in your programs as per your requirements.
- Global– The argument for this property is either “True” or “False.” It is “False” by default.If you set the parameter to “True” it finds all the matches in a string. However, if by default it will only check for the first match.
- IgnoreCase– This parameter accepts either “True” or “False.” By default it is set to “False” and it is case sensitive. If you want the search to ignore the case, then set the value to “True”.
- MultiLine- This parameter accepts either “True” or “False.” If the value is set to true, it will search each line of a multiline string. While the “False” searches only the current line.
Let’s use the properties listed above in a sample program
Example 1: VBA program to test whether the string is a valid email address
Set objRegExp_1 = CreateObject("vbscript.regexp")
objRegExp_1.Global = true
Set regExp_Matches = objRegExp_1.Execute(strToSearch)
If regExp_Matches.Count = 1
MsgBox("This string is a valid email address.")
In this program, we declared sub SubroutineRegEx_Tester().
Set objRegExp_1 = CreateObject(“vbscript.regexp”), this statement creates the regular expression object from VBScript. The object has functions which perform regular expression searches. If you want the regular expression to match the string, just for the further occurrence of the string set, “objRegExp1.Global = true.” If you want to match for all occurrences of the string set “objRegExp.Global = false.” If you want your regular expression to ignore the case of string, set “objRegExp1.IgnoreCase=true” If case is important set this property to false.”objRegExp_1.Pattern=[a-z,A-Z]*@[a-z,A-Z]*.com” this pattern denotes the string with alphabetical characters before and after @ symbol. If the regular expression object match the text in your input string set “regExpMatches.Count=1”
Here is another VBA RegEx Method program which finds and replaces string. Let’s take a look.
Example 2: Excel VBA RegEx Replace function
Function RegExpReplace1(ByVal WhichString1 As String, _
ByVal Pattern1As String, _
ByVal ReplaceWith1As String, _
Optional ByVal IsGlobal1As Boolean = True, _
Optional ByVal IsCaseSensitive1As Boolean = True) As String
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
objRegExp.Global = IsGlobal1
objRegExp.Pattern = Pattern1
objRegExp.IgnoreCase = Not IsCaseSensitive1
RegExpReplace = objRegExp.Replace(WhichString, ReplaceWith)
In this program, the user can enter the values at the runtime because of the parameter “ByVal” is used.The function RegExpReplace1 is called with the value of the parameters. One parameter is the string to be searched; another parameter is the replacement string. Note that Global property is set to true. This means all the matches of a string are searched. This program searches for occurrences of a string and substitutes each of them with the replacement string.
You may also want to check out the the neat VBA examples in this course by MrExcel.
Example3: Function to extract only numbers in a string
Function OnlyNumbers_1(ByValWhichString As String) As Variant
OnlyNumbers = CDbl(RegExpReplace(WhichString, _
"[^0-9]", vbNullString, True))
Note that CDBl() converts an expression into type double.
More on Regex.Replace Method
Regex.Replace Method in a designated input string replaces strings that match a regular expression pattern with a specified replacement string. This method is overloaded, i.e. the same function name is used with different numbers and types of arguments. What function is called is determined by the number and type of parameters passed.We look at a few of the overloadedReplace functions.
Replace(String, String, Int32) in a designated input string, replaces a specified maximum number of strings that match a regular expression pattern with a specified replacement string.
Replace(String, String, String) – In a designated input string, this function replaces all strings that match a specified regular expression with a specified replacement string.
Replace(String, String, Int32, Int32) – In a in substring, the function replaces aspecified maximum number of strings that match a regular expression pattern with a specified replacement string.
Programming is best learned by trying it out yourself. You have to play around with the code and see how to things work out. Do try out these examples for yourself and let us know your feedback. If you ever need a quick refresher course, feel free to take this awesome Excel VBA course by Infinite Skills.
Top courses in Excel VBA
Excel VBA students also learn
Empower your team. Lead the industry.
Get a subscription to a library of online courses and digital learning tools for your organization with Udemy Business.