Python String Contains What, Exactly?: A Simple Guide

Python is a simple, high-level programming language with very powerful capabilities. It is an interpreted language that supports object-oriented programming. It is highly interactive and easy for first-time programmers to learn. What’s more, it is free and open source, which has helped it gain popularity and give languages like Perl and Ruby a run for their money. Python can be installed on all the popular platforms, including Windows, macOS and Linux/Unix. It can be used as an interpreted scripting language, and its source is compiled to byte-code, which suits larger applications. It can easily be used to write web programs, or even to develop your own games.

In this tutorial we delve into the world of strings in Python. Yes, it is a world in itself! In Python, strings can be written as basic strings, multiline strings, raw strings and Unicode strings.

Basic Strings

A basic string is a sequence of characters (a character is typically what a user produces with a single keystroke), or it can be empty. In Python, basic strings are enclosed in double quotes (" ") or single quotes (' '). For example, "1234" is a string, while 1234 is a number.
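
To make the difference between a string and a number concrete, here is a minimal sketch, assuming Python 3. The variable names are illustrative only and are not part of the tutorial.

```python
# A quoted value is a string; an unquoted one is a number.
text = "1234"
number = 1234
empty = ""

text_type = type(text).__name__      # name of the string type
number_type = type(number).__name__  # name of the integer type

print(text_type, number_type)
print(len(empty))            # an empty string has length 0
print(text == str(number))   # converting the number gives an equal string
```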

Multiline Strings

Multiline strings are used when the string needs to span several lines, that is, to contain new-line characters. The string is enclosed in three double quotes or three single quotes, i.e. either """ or '''. For example, the following three lines form a single string.

"""This is a short tutorial on Python.
I hope you enjoy this tutorial and
learn to exploit the power of Python"""

OR

''' This is a multiline
string using three single quotes'''

Note that in both examples, the string is spread across at least two lines.
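
The line breaks typed inside the triple quotes become part of the string. A small sketch, assuming Python 3 and using the illustrative names message and lines, shows one way to see this:

```python
message = """This is a short tutorial on Python.
I hope you enjoy this tutorial and
learn to exploit the power of Python"""

# splitlines() breaks the string at the embedded new-line characters.
lines = message.splitlines()

print(len(lines))
print("\n" in message)
print(lines[0])
```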

Raw Strings

Raw strings are used when the user wants to work with strings that contain backslashes (\), e.g. Windows directory paths. When a string is defined as a raw string, Python does not treat the backslash as a special (escape) character. To define a raw string, prefix the opening quote with r. For instance, to get the text Hello \there!, where a regular string would turn \t into a tab, define the string as r'Hello \there!'. Any basic or multiline string can be made raw by placing r before its opening quote. The examples that follow make this clearer. First, let’s look at how a regular string behaves with backslashes:

#!/usr/bin/python
print("c:\\documents")

Output – c:\documents

Note that only one \ was printed: in a regular string the first \ is an escape character, so the pair \\ stands for a single backslash.

Now let’s look at the same example as a raw string. Note that \\ is printed as \\.

#!/usr/bin/python
print(r"c:\\documents")

Output – c:\\documents
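
One way to confirm what each literal really contains is to compare lengths. The following is only a sketch, assuming Python 3, and the names regular and raw are made up for illustration.

```python
regular = "c:\\documents"   # the escape sequence \\ becomes one backslash
raw = r"c:\\documents"      # both backslashes are kept as typed

print(len(regular))
print(len(raw))

# A raw string with a single backslash equals the escaped regular string.
print(regular == r"c:\documents")
```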

Unicode Strings

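A Unicode string stores Unicode characters rather than 8-bit bytes. This is useful for accommodating languages such as Mandarin and Japanese, which have thousands of characters. In Python 2, a Unicode string is declared by prefixing u to any string literal (u' or u" or u''' or u"""), e.g. u'Hello there!', and each character typically takes 16 or 32 bits depending on how Python was built. In Python 3, every string is already a Unicode string, and the u prefix is accepted (since Python 3.3) but optional.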
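
As a rough illustration, assuming Python 3.3 or later (where the u prefix is accepted but makes no difference), the sketch below shows that length is counted in characters, not bytes. The variable names are illustrative.

```python
greeting = u"Hello there!"   # u prefix, as described above
plain = "Hello there!"

print(greeting == plain)     # same text, and in Python 3 the same type

word = u"\u4f60\u597d"       # two Chinese characters written as escapes
print(len(word))                   # number of characters
print(len(word.encode("utf-8")))   # number of bytes once encoded as UTF-8
```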

String Search

Python has many built-in methods for string manipulation and searches. If you want to check whether a string contains a particular sub-string, there are several ways to do it, depending on what kind of result you want.

The in & not in operators

You can use the in and not in operators, which return a Boolean result of True or False. For example:

>"!" in "Hello there!"

True

>"World" in 'Hello there!'

False

>"!" not in "Hello there!"

False

>"World" not in 'Hello there!'

True
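
The same test can be wrapped in a function when it is needed in several places. This is a minimal sketch, assuming Python 3. The helper name contains is hypothetical and not a built-in.

```python
def contains(text, fragment):
    """Return True when fragment occurs anywhere in text (hypothetical helper)."""
    return fragment in text

greeting = "Hello there!"

print(contains(greeting, "there"))
print("World" not in greeting)
print("hello" in greeting)   # the match is case-sensitive
print("" in greeting)        # the empty string is contained in every string
```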

The find & rfind methods

The find method returns the lowest index at which the sub-string is found, and rfind returns the highest index. In other words, find reports the first occurrence counting from the left and rfind reports the last one. If the sub-string is not found, -1 is returned.

Syntax

str.find(substr,beg,end)
str.rfind(substr,beg,end)

str is the main string

substr is the sub-string that you want to search

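beg is the index where you want the search to begin (optional; defaults to the start of the string)

end is the index where you want the search to end (optional; defaults to the end of the string, and the character at end itself is not searched)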

Example

>str1 = 'Hello there!'
>substr = "th"
>substr1 = '12'
>str1.find(substr,0,len(str1))
6
>str1.rfind(substr,0,len(str1))
6
>str1.rfind(substr1,0,len(str1))
-1

Because "th" occurs only once in str1, find and rfind return the same index.
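
The difference between find and rfind shows up when the sub-string occurs more than once. The sketch below assumes Python 3. The helper find_all is hypothetical and simply calls find repeatedly, and the sample sentence is invented for illustration.

```python
def find_all(text, fragment):
    """Return the start index of every non-overlapping match (hypothetical helper)."""
    if not fragment:
        # An empty fragment matches everywhere and would loop forever.
        raise ValueError("fragment must not be empty")
    positions = []
    start = text.find(fragment)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match that was found.
        start = text.find(fragment, start + len(fragment))
    return positions

sentence = "Hello there, the weather is fine"

print(sentence.find("the"))    # first occurrence
print(sentence.rfind("the"))   # last occurrence
print(find_all(sentence, "the"))
```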

The index & rindex methods

The index and rindex methods are similar to find and rfind, except that if the sub-string is not found a ValueError exception is raised instead of -1 being returned.

Syntax

str.index(substr,beg,end)
str.rindex(substr,beg,end)

Example

>str1 = 'Hello there!'
>substr = "th"
>substr1 = '12'
>str1.index(substr,0,len(str1))
6
>str1.rindex(substr,0,len(str1))
6
>str1.rindex(substr1,0,len(str1))
ValueError: substring not found
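
Since index raises an exception rather than returning -1, callers usually wrap it in try/except. Here is a minimal sketch, assuming Python 3. The helper position_or_none is a hypothetical name.

```python
def position_or_none(text, fragment):
    """Return the index of fragment in text, or None if absent (hypothetical helper)."""
    try:
        return text.index(fragment)
    except ValueError:
        # index raises ValueError in the case where find would return -1.
        return None

str1 = "Hello there!"

print(position_or_none(str1, "th"))
print(position_or_none(str1, "12"))
```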

The count method

The count method returns the number of non-overlapping occurrences of a sub-string in a string. The start and end indexes are optional.

Syntax

str.count(substr,start,end)

str is the main string

substr is the sub-string that you want to count

Example

>str1 = "Hello there!"
>str2 = "e"
>str1.count(str2,0,11)
3
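
Two details of count are worth seeing in action: the optional range limits the search to a slice, and matches are counted without overlapping. A short sketch, assuming Python 3:

```python
str1 = "Hello there!"

whole = str1.count("e")          # search the whole string
prefix = str1.count("e", 0, 2)   # search only the slice str1[0:2]
pairs = "aaaa".count("aa")       # matches are counted without overlapping

print(whole, prefix, pairs)
```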

The startswith & endswith methods

These methods return True or False depending on whether the string starts with or ends with a particular sub-string. The start and end indexes for the search are optional.

>'Hello there!'.endswith('!')
True
>'Hello there!'.startswith('!')
False
>'Hello there!'.startswith('t',6,11)
True
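
Both methods also accept a tuple of candidates, which is handy for checks such as file extensions. The sketch below assumes Python 3, and the file name is a made-up example.

```python
filename = "report.txt"   # hypothetical file name

print(filename.endswith(".txt"))
print(filename.endswith((".csv", ".txt")))   # a tuple: True if any suffix matches
print(filename.startswith("port", 2))        # start comparing at index 2
print(filename.startswith("port"))           # without the offset there is no match
```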

Methods with boolean results

The following string methods return a Boolean result. The general syntax of all of them is explained here.

str.method()

str is the main string that is being tested

method() is the string method that returns a Boolean.

The isalnum method checks whether the string is all alphanumeric, i.e. made up only of letters and digits.

>"1234".isalnum()

True

>"ABCD".isalnum()

True

>"AB-CD".isalnum()

False

The isalpha method checks whether the string is all alphabetic, i.e. made up only of letters.

>"ABCD".isalpha()

True

>"A>B".isalpha()

False

The isdigit method checks whether the string consists only of digits.

>"1232".isdigit()

True

>"1+2=3".isdigit()

False

The islower & isupper methods check whether all the letters in the string are lowercase or uppercase, respectively.

>'is this all lower case'.islower()

True

>'is this all upper case'.isupper()

False

The isspace method checks whether the string consists only of whitespace characters (spaces, tabs, new-lines).

>'\t'.isspace()

True

The istitle method checks whether the string is formatted like a title, i.e. every word starts with an uppercase letter followed by lowercase letters.

>'Is this a title'.istitle()

False
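
To compare these checks side by side, the following sketch (assuming Python 3) runs all of them against one string and lists those that return True. The helper describe is hypothetical and the sample inputs are invented.

```python
def describe(text):
    """Return the names of the Boolean checks that text passes (hypothetical helper)."""
    checks = ["isalnum", "isalpha", "isdigit", "islower",
              "isupper", "isspace", "istitle"]
    # getattr looks up each method by name; calling it gives True or False.
    return [name for name in checks if getattr(text, name)()]

print(describe("1234"))
print(describe("Hello"))
print(describe(" \t"))
```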

Remember that string comparisons are case sensitive: a lowercase a does not match an uppercase A. Python has very powerful features for effective string searches and is used in a wide range of real-world applications. Now go install Python and start pulling those “strings”!