Excel Substring Functions: How (and Why) to Use Them

When you have a column of some 500 text entries in Excel and you want to remove the same characters from all of those entries, what’s the best way to do that?

If you said “backspace” you need to read on.

Among the many useful features packed in to Microsoft’s spreadsheet program, there are a set of text functions that will allow you to do exactly this type of data cleansing easily. And, if you get to understanding these functions well, you will find a range of additional situations where they will most certainly come in handy. Get creative with them, and they will generally help with your problem solving with Excel.

You can learn all about these functions as part of an intermediate course in Microsoft Excel, which is a particularly good idea if you need to polish your spreadsheeting skills. Here, we will focus in on the use of three functions: Left, Mid, and Right. You’ll also learn briefly how to use these along with the Find function in a larger formula.

Left, Right, and Mid

The Left, Right, and Mid functions extract a substring (part of a text entry), of a length you specify, from text in a cell. The Left function starts the extraction from the left, and the Right, obviously, begins from the right.  The Mid function starts at a point you indicate within the function.

That might initially sound confusing. But as is the case for many functions in Excel, these are probably best learned by example, so let’s look at one.

Say you imported a long product list in to a spreadsheet, and for whatever reason, the source you imported it from added an asterisk (*) at the beginning of each cell containing the Product ID. For instance:

 A 1 *ABC-123 2 *DEF-456 3 *GHI-789

You want to cleanse the data, but you don’t want to spend the rest of the afternoon pressing the backspace key just to get the one character out of there. So you need a substring function.

In this specific case, you can use the Right function to clean this up. The function can be entered directly in to the cell in the format =RIGHT(text, number of characters).

Within the parentheses, the text parameter will take a cell reference, so here we will plug in A1 for the first entry. For the number of characters, you will use seven for this example since the product ID contains three letters, three numbers, and a hyphen. Thus, the function in your first cell will look like this:

=RIGHT(A1, 7)

Now, if the unwanted character appeared at the other end of the text (e.g. ABC-123*) you would use the Left function, rather than the Right, to extract the characters you want to keep. And this function would use the same parameters:

=LEFT(A1, 7)

So when does the Mid function come in to play? Well, in some cases, you might have characters on both ends that you don’t want to include (e.g. *ABC-123*). And in those cases, the Mid function will be your best option.

The Mid function uses a slightly different syntax. Since you are extracting a substring from the middle of an entry, you need to indicate the starting point. So in the product ID example, you would write 2 to indicate that you want to start with the second character. The complete function, then, would be:

=MID(A1, 2, 7)

With any of these functions, once you have the first instance in your initial cell, you will simply copy it down to the remaining cells.  If you wish, you can copy the cleaned up entries and paste their values over the original cells using the Values option from the Paste Special menu.  If those operations sound unfamiliar, you may want to revisit these and other procedures in an online Excel course.

Using substring functions with =Find()

Combining functions often helps solve problems in Excel, which you can learn about in an online course dedicated to Excel functions and formulas.  And certainly, you can extend the use of substring functions by combining them with other functions.

One of the most useful functions to combine them with is Find.  Essentially, the Find function offers a way to modify the character number inputs to make them more flexible. Its basic format is as follows:

=Find(find_text, within_text, [start_number])

Again, the best way to learn this is by example.

Suppose you are given a list of email addresses for a company, and they are in the following format:

 A 1 John.Doe@ABCCompany.com

You want to turn these addresses into a contact list that also includes the contacts’ first and last names.  So, rather than going through each entry and typing those details in separately, you can use substring functions, combined with the Find function, to separate out the names.

In this example you can specifically use the Left function and the Mid function.  Start by setting up a column for the first name and a column for the last name. In the column for the first name, you will want to set up the Left function with a Find function embedded in it to specify where the text extraction should end.

Taking the example above, you want to extract the name John. And as you can see, that name ends with a period in the context of the email address.  So for that reason, you want to identify the period as the end point for your Left function.  That is where the Find function comes in to play. By using the Find function to identify the location of the period, you can set your end point and extract the name.

Ultimately to do this, your find function will look like this:

Find(“.”, A1)

Simply put, you are indicating that you want to locate the period within A1.  This returns the location of that period as a numeric value.  And as such, you can use it as the end point for your left function.  So the formula for the first cell in your first name column will look like this:

=LEFT(A1,FIND(“.”,A1) -1)

With this function, you have identified the result of your Find function as the number of characters to extract.  Note that the “-1” is also included because we will not want to have the period included in the first name result, so you need to move the end point over one space to the left.

Next, for the last name, you will use the Mid function to extract the contact’s last name from between the period and the “@” symbol.  Applying what you just learned about the Mid function and the Find function, you should have a good idea of how to do this.  Since you need to determine both the start and and values for the extraction, you will need to use two Find functions.

Your function will look as follows:

=MID(A1, FIND(“.”, A1)+1, FIND(“@”, A1)-1)

If that all moved a bit fast, don’t worry.  You can get some more detailed instructions and guided examples in an advanced course on Excel.  Hopefully, though, this quick guide has helped you get a handle on how these functions work so that you can start putting them to use and saving yourself a lot of precious time.