Display Special Characters with Escape Sequences

Unless you’re programming in Assembly, you have to use ordinary letters, numbers, and other symbols to write your code. There’s no way out of this. Regardless of your programming language of choice, every keyword, operator, and symbol you will ever use comes from real world languages. This wouldn’t be so bad if we didn’t need to use strings in our programs. But any program worth using has to interact with the user, and strings are the main tool we have to accomplish this. Fortunately, most programming languages come with a set of string escape sequences that we can use to implement special characters and phrases in our strings.

What Are Escape Sequences?

According to Wikipedia, escape sequences are series of characters that change the state of computers and their attached devices. Also known as control sequences, these series of characters let us handle computer tasks that would be otherwise impossible given how computers work.

All escape sequences begin with an escape character that signals the start of the sequence. In the past, this character was a dedicated character sent by the Esc key, but today, most programming languages use the backslash (\) instead. Regardless of the escape character, all escape sequences let us access and display non-printable characters and other special hardware features through our programs.

Standard Escape Sequences

While many programming languages have their own set of escape sequences, most share the same basic standard set developed for the C programming language. You can find the most common sequences in the table below. Just remember that these codes are case sensitive. Using the wrong case gives you a different sequence or an unrecognized one, which compilers typically flag with a warning or an error.

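Sequence    Meaning
\'          Single quote
\"          Double quote
\\          Backslash
\a          Audible bell
\b          Backspace
\f          Form feed
\n          Newline
\r          Carriage return
\t          Horizontal tab

Note that \s is not part of the C set. It comes from regular expressions, where it matches white space. In a C or C++ string, you type a space directly.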

Besides the ones in the table, the standard set includes numeric sequences that let you type any character, printable or non-printable, by its code. To use this format, you write \x followed by the character's code in hexadecimal. For example, the escape sequence \x56 is the letter V. For characters outside the basic set, C and C++ also provide \u followed by four hexadecimal digits and \U followed by eight, which name a Unicode code point directly.

Here is an example that uses a few of these codes.

#include <cstdio>

int main()
{
    // \n starts a new line; \" puts a double quote inside the string
    std::printf("This\nis\na\ntest\n\nShe said, \"How are you?\"\n");
}

The example prints:

This
is
a
test

She said, "How are you?"

As you see, you implement escape codes as if they were ordinary characters in your strings. There is nothing more to them.

HTML Escape Sequences

While most languages use the C escape codes above, there are a few notable exceptions including HTML.

HTML uses two different sets of escape sequences. The first, known as HTML character entities, is used directly in HTML code to represent special characters. The other set is used by web browsers to allow special and non-English characters in web addresses.

Most people only need to know the HTML character entities. These escape sequences use the ampersand (&) as the escape character and must end in a semicolon (;). The following table includes the most common HTML escape sequences. Out of all of them, you will most likely use &nbsp; the most. It means non-breaking space, and it creates a space that web browsers will neither collapse nor break a line at.

Display    Name                      HTML Entity    Unicode
(space)    Non-breaking space        &nbsp;         &#160;
<          Less than                 &lt;           &#60;
>          Greater than              &gt;           &#62;
&          Ampersand                 &amp;          &#38;
"          Double quotation mark     &quot;         &#34;
©          Copyright                 &copy;         &#169;
®          Registered trademark      &reg;          &#174;
™          Trademark (USA)           &trade;        &#8482;
×          Multiplication sign       &times;        &#215;
÷          Division sign             &divide;       &#247;

As you can see from the table, you can use Unicode here as well: write &# followed by the character's decimal code point and a semicolon, or &#x followed by the code point in hexadecimal. While this list only scratches the surface, you can find complete lists of HTML character entities with a simple internet search.

HTML character entities work like the C escape sequences do. You insert them into your text and attribute values, and web browsers take it from there. Just remember that HTML escape sequences must end in a semicolon or your web pages may not display as you expect. Also note that HTML entity names are as case sensitive as their C counterparts.

As for the URL escape sequences, they are the hexadecimal values of the character's bytes (in UTF-8), each preceded by the percent (%) symbol. A space, for example, becomes %20.

Other Nonstandard Sets of Escape Sequences

While you are most likely to encounter the above two sets of sequences, there are a few nonstandard sets out there with their own escape characters and sequence formats. The most common of these alternatives is simply doubling the character you need. Generally used by BASIC programming languages such as Visual Basic, you write these sequences by typing the escaped character twice, so two double quotes in a row inside a string stand for one literal double quote. As you can only double printable characters, these programming languages often come with special string constants for non-printable characters, such as vbTab and vbNewLine in Visual Basic.

The Bottom Line

Escape sequences exist to help us embed non-printable characters or hardware controls into our programs. While every language has its own escape rules, most sequences start with a special escape character followed by the actual character code. Some languages such as HTML require ending characters as well. Even if you rarely need the more obscure ones, escape sequences let your programs display punctuation marks and format their output.