Python is a simple, high-level programming language with very powerful capabilities. It is an interpreted language that supports object-oriented programming. It is highly interactive and easy to learn for first-time programmers. What's more, with a free license available for its source code, it is gaining popularity and giving languages like Perl and Ruby a run for their money. Python can be installed on all the popular platforms, such as Windows, Mac and Linux/Unix. It can be used as an interpreted scripting language or compiled to byte-code for big applications. It can be used to write web programs, or even to develop your own games.
In this tutorial we delve into the world of strings in Python. Yes, it is a world in itself! In Python, strings are further classified into basic strings, multiline strings, raw strings and Unicode strings.
A basic string is a sequence of characters (a character is typically what a user creates with a single keystroke), or it can be empty, which is also called a null string. In Python, basic strings are enclosed in double quotes (" ") or single quotes (' '). For example, "1234" is a string while 1234 is a number.
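As a small illustration of the point above, the sketch below compares the two quote styles and a number. The variable names are made up for this example, and it assumes Python 3.

```python
# Either quote style produces the same string value.
text_double = "1234"
text_single = '1234'
number = 1234
empty = ""          # an empty (null) string is still a string

same_value = text_double == text_single      # quote style does not matter
is_text = isinstance(text_double, str)       # "1234" is a string
differs_from_number = text_double != number  # a string never equals an int

print(same_value, is_text, differs_from_number, len(empty))
```

If you need the numeric value of such a string, int(text_double) converts it.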
Multiline strings are used when the user needs to include new-line characters in the string. The string is enclosed in three double quotes or three single quotes, i.e. either """ or '''. For example, the following three lines form a single string.
"""This is a short tutorial on Python.
I hope you enjoy this tutorial and
learn to exploit the power of Python"""
'''This is a multiline string
using three single quotes'''
Note that in both examples, the string is spread across at least two lines.
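To see that a triple-quoted string really keeps its line breaks, here is a minimal sketch. The exact positions of the line breaks are an assumption made for this example, and the variable names are hypothetical.

```python
# A triple-quoted string keeps every line break typed inside it.
tutorial_text = """This is a short tutorial on Python.
I hope you enjoy this tutorial and
learn to exploit the power of Python"""

lines = tutorial_text.splitlines()        # split on the embedded new-lines
newline_count = tutorial_text.count("\n")

print(len(lines))      # prints 3
print(newline_count)   # prints 2
```

Three lines of text are joined by two new-line characters, which is why the two counts differ by one.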
Raw strings are used when the user wants to operate on strings that contain backslashes (\), e.g. Windows directory paths. When a string is defined as a raw string, Python does not treat the backslash as a special character. To define a raw string, prefix the opening quote with r. For instance, if the string in question is "Hello \there!", we should define it as r'Hello \there!'; without the prefix, Python would read \t as a tab character. Any basic or multiline string can be made a raw string by placing r before the opening quote. It will become clearer in the example below. First, let's look at how a regular string behaves with a backslash:
#!/usr/bin/python
print("c:\\documents")
Output: c:\documents
Note that only one "\" got printed, because in a regular string the first "\" acts as an escape character for the second.
Now let's look at the same example as a raw string. Note that \\ got printed as \\.
#!/usr/bin/python
print(r"c:\\documents")
Output: c:\\documents
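The two examples can also be compared side by side. The sketch below assumes Python 3 (hence print as a function), and the variable names are only for illustration.

```python
# The same characters typed with and without the r prefix.
regular = "c:\\documents"   # escape sequence: two typed backslashes become one
raw = r"c:\\documents"      # raw string: both backslashes are kept

print(regular)                  # prints c:\documents
print(raw)                      # prints c:\\documents
print(len(regular), len(raw))   # prints 12 13
```

The lengths differ by one because the raw string stores the extra backslash as an ordinary character.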
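In Python 2, a Unicode string is stored using 16 or 32 bits per character, depending on how Python was built (versus 8 bits per character for other strings). This is useful to accommodate different languages (Mandarin, Japanese, etc., which have thousands of characters). A Unicode string is declared by prefixing u to any string: u' or u" or u''' or u""", e.g. u'Hello there!'. In Python 3, every string is a Unicode string, so the u prefix is optional.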
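A short sketch of how this looks in practice, assuming Python 3.3 or later (where the u prefix is accepted but optional). The sample text and variable names are assumptions for the example.

```python
# On Python 3 the u prefix is accepted, and the result is an ordinary str.
greeting = u"Hello there!"
is_plain_str = type(greeting) is str

# Two Mandarin characters, written as escapes so the file encoding
# does not matter.
mandarin = "\u4f60\u597d"
char_count = len(mandarin)                   # counted as characters
byte_count = len(mandarin.encode("utf-8"))   # counted as UTF-8 bytes

print(is_plain_str)   # prints True
print(char_count)     # prints 2
print(byte_count)     # prints 6
```

The character count and the byte count differ because each of these characters takes three bytes in UTF-8.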
Python has many built-in methods for string manipulation and searching. If you want to check whether a string contains a particular sub-string, there are many ways to do it, depending on what kind of result you want.
The in & not in operators
You can use the in and not in operators, which return a Boolean result of True or False. For example:
>"!" in "Hello there!"
True
>"World" in 'Hello there!'
False
>"!" not in "Hello there!"
False
>"World" not in 'Hello there!'
True
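The membership tests can be tried in a script as well. This is a minimal sketch using the in and not in operators; the variable names are invented for the example.

```python
greeting = "Hello there!"

has_bang = "!" in greeting             # the sub-string is present
has_world = "World" in greeting        # the sub-string is absent
lacks_world = "World" not in greeting  # negated membership test
wrong_case = "hello" in greeting       # the search is case sensitive

print(has_bang, has_world, lacks_world, wrong_case)
# prints True False True False
```

Because the result is a plain Boolean, these expressions can go straight into an if statement.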
The find & rfind methods
The find method returns the lowest index at which the sub-string is found, and the rfind method returns the highest such index. The find method searches the string from left to right and the rfind method searches from right to left. If the sub-string is not found, -1 is returned. The general syntax is str.find(substr, beg, end) or str.rfind(substr, beg, end), where:
str is the main string
substr is the sub-string that you want to search for
beg is the index where you want the search to begin
end is the index where you want the search to end (the character at end itself is not included)
>str1 = 'Hello there!'
>substr = "th"
>substr1 = '12'
>str1.find(substr, 0, len(str1))
6
>str1.rfind(substr, 0, len(str1))
6
>str1.rfind(substr1, 0, len(str1))
-1
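The difference between find and rfind only shows up when the sub-string occurs more than once. The sketch below uses a made-up sample sentence for that reason; the variable names are assumptions too.

```python
text = "the other theme"   # "th" occurs at indexes 0, 5 and 10

first = text.find("th")            # lowest index of a match
last = text.rfind("th")            # highest index of a match
after_start = text.find("th", 1)   # start searching at index 1
bounded = text.find("th", 1, 6)    # search only within text[1:6]
missing = text.find("xyz")         # not present anywhere

print(first, last, after_start, bounded, missing)
# prints 0 10 5 -1 -1
```

The bounded search fails because the match at index 5 would need the character at index 6, which the end argument excludes.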
The index & rindex methods
The index and rindex methods are similar to find and rfind, except that if the sub-string is not found, a ValueError exception is raised instead of -1 being returned.
>str1 = 'Hello there!'
>substr = "th"
>substr1 = '12'
>str1.index(substr, 0, len(str1))
6
>str1.rindex(substr, 0, len(str1))
6
>str1.rindex(substr1, 0, len(str1))
ValueError: substring not found
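Since index and rindex raise an exception rather than returning -1, a script usually wraps them in try/except. The helper below, find_or_none, is a hypothetical name invented for this sketch and is not part of Python.

```python
def find_or_none(text, sub):
    """Return the index of sub in text, or None when it is absent."""
    try:
        return text.index(sub)
    except ValueError:
        # index() raises ValueError when the sub-string is not found
        return None

greeting = "Hello there!"
found = find_or_none(greeting, "th")
absent = find_or_none(greeting, "12")
last_e = greeting.rindex("e")   # highest index at which "e" occurs

print(found, absent, last_e)    # prints 6 None 10
```

Whether to prefer find with a -1 check or index with an exception handler is mostly a matter of style; the exception is handy when a missing sub-string is a genuine error.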
The count method
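Returns the number of non-overlapping occurrences of a sub-string in a string. The general syntax is str.count(substr, beg, end); the beg and end indexes are optional.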
str is the main string
substr is the sub-string that you want to count
>str1 = "Hello there!"
>str2 = "e"
>str1.count(str2, 0, 11)
3
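A few more count calls, sketched with invented variable names, show the effect of the optional indexes and the fact that overlapping matches are not counted twice.

```python
greeting = "Hello there!"

all_e = greeting.count("e")          # whole string
early_e = greeting.count("e", 0, 8)  # only within greeting[0:8], "Hello th"
all_l = greeting.count("l")
overlapping = "aaaa".count("aa")     # matches may not overlap

print(all_e, early_e, all_l, overlapping)   # prints 3 1 2 2
```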
The startswith & endswith methods
Return True or False depending on whether the string starts with or ends with a particular sub-string. The start and end indexes for the search are optional.
>'Hello there!'.endswith('!')
True
>'Hello there!'.startswith('!')
False
>'Hello there!'.startswith('t', 6, 11)
True
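The same checks as a script, plus one extra: both methods also accept a tuple of candidate sub-strings. This is a minimal sketch and the variable names are assumptions.

```python
greeting = "Hello there!"

ends_with_bang = greeting.endswith("!")
starts_with_bang = greeting.startswith("!")
slice_starts_with_t = greeting.startswith("t", 6, 11)    # tests greeting[6:11]
starts_with_either = greeting.startswith(("Hi", "Hello"))  # any of the tuple

print(ends_with_bang, starts_with_bang, slice_starts_with_t, starts_with_either)
# prints True False True True
```

The tuple form saves chaining several calls together with or.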
Methods with boolean results
The following string methods return a Boolean result. The general syntax of all of them is str.method(), where:
str is the main string that is being tested
method() is the string method that returns a Boolean.
The isalnum method checks whether the string is entirely alphanumeric (letters and digits only).
The isalpha method checks whether the string is entirely alphabetic.
The isdigit method checks whether the string consists of digits only.
The islower & isupper methods check whether all the letters in the string are lower case or upper case, respectively.
>'is this all lower case'.islower()
True
>'is this all upper case'.isupper()
False
The isspace method checks whether the string consists only of whitespace characters.
The istitle method checks whether the string is formatted like a title, i.e. every word starts with an upper-case letter followed by lower-case letters.
>'Is this a title'.istitle()
False
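To compare these tests side by side, the sketch below runs each method on a small sample string. The samples and the name checks are invented for this example.

```python
# Each entry pairs a description with the result of one Boolean method.
checks = {
    "isalnum on 'abc123'": "abc123".isalnum(),   # letters and digits only
    "isalpha on 'abc123'": "abc123".isalpha(),   # contains digits
    "isdigit on '123'": "123".isdigit(),
    "islower on 'hello there'": "hello there".islower(),
    "isupper on 'Hello'": "Hello".isupper(),     # mixed case
    "isspace on '   '": "   ".isspace(),         # whitespace only
    "isspace on 'a b'": "a b".isspace(),         # contains letters too
    "istitle on 'Is This A Title'": "Is This A Title".istitle(),
}

for description, result in checks.items():
    print(description, "->", result)
```

Note that isspace is False for 'a b': it tests whether the whole string is whitespace, not whether the string contains a space.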
Remember that string comparisons are case sensitive: a lowercase a will not match an uppercase A. Python has very powerful features for effective string searches and is used in various real-world applications. Now go install Python & start pulling those "strings"!