Python Strings

Learn about strings in Python and how to perform various operations on the strings with simple examples.

1. Creating a String

In Python, a string literal is:

a sequence of Unicode characters (code points)
surrounded by either single quotation marks or double quotation marks
of any length, limited only by available memory

text = 'hello world'

text = "hello world"

A multi-line string is created using three single quotes or three double quotes.

text = '''Say hello
		to python
		programming'''

text = """Say hello
		to python
		programming"""

Python does not have a separate character data type; a single character is simply a string of length 1.
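A minimal sketch of that point: a one-character literal has the same type as any longer string. The variable name ch is only illustrative.

```python
ch = 'a'                        # illustrative variable name

print(type(ch).__name__)        # str
print(len(ch))                  # 1
print(type(ch) is type('abc'))  # True - same type as a longer string
```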

2. Substring or Slicing

We can get a range of characters by using the slice syntax. Indexes start from 0. For example, text[2:5] returns the substring from position 2 (inclusive) to position 5 (exclusive).

text = 'hello world'

print(text[2:5])	# llo

Negative indexes count from the end of the string. For example, text[-5:-2] returns the substring from position -5 (inclusive) to position -2 (exclusive).

text = 'hello world'

print(text[-5:-2])	# wor

3. String as an Array

In Python, strings behave like arrays of characters. Square brackets can be used to access individual characters by index.

text = 'hello world'

print(text[0])	# h
print(text[1])	# e
print(text[2])	# l

print(text[20])	# IndexError: string index out of range

4. String Length

The len() function returns the length of a string:

text = 'hello world'

print(len(text))	# 11

5. String Formatting

To format a string in Python, place { } placeholders in the string at the desired positions, then pass the values to the format() method.

We can also put the argument position (starting from zero) inside the placeholders.

age = 36
name = 'Lokesh'

txt = "My name is {} and my age is {}"

print(txt.format(name, age))	# My name is Lokesh and my age is 36

txt = "My age is {1} and the name is {0}"

print(txt.format(name, age))	# My age is 36 and the name is Lokesh
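Placeholders can also carry a name or a format specification. The sketch below assumes a made-up price value just to show a two-decimal format; the other values are taken from the examples above.

```python
name = 'Lokesh'
age = 36
price = 49.5    # hypothetical value, only for the format-spec demo

# Named placeholders are filled from keyword arguments
named = "My name is {name} and my age is {age}".format(name=name, age=age)
print(named)    # My name is Lokesh and my age is 36

# ':.2f' formats the number with two digits after the decimal point
total = "Total: {:.2f}".format(price)
print(total)    # Total: 49.50
```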

6. String Methods

6.1. capitalize()

It returns a copy of the string with the first character converted to upper case and the remaining characters converted to lower case. When the first character is not a letter, it is left unchanged, but the rest of the string is still converted to lower case.

name = 'lokesh gupta'

print( name.capitalize() )	# Lokesh gupta

txt = '38 yrs old lokesh gupta'

print( txt.capitalize() )	# 38 yrs old lokesh gupta

6.2. casefold()

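It returns a copy of the string with all characters converted to lower case. It is similar to lower() but more aggressive, which makes it better suited to case-insensitive comparisons.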

txt = 'My Name is Lokesh Gupta'

print( txt.casefold() )	# my name is lokesh gupta
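For plain English text casefold() and lower() give the same result. The difference shows up with characters such as the German ß; the sample word below is only an illustration.

```python
word = "Straße"             # illustrative sample word

lowered = word.lower()      # ß has no separate lower case form, so it stays
folded = word.casefold()    # ß is folded to 'ss'

print(lowered)              # straße
print(folded)               # strasse

print(folded == "STRASSE".casefold())   # True
```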

6.3. center()

It center-aligns the string, using a specified character (space is the default) as the fill character.

In the given example, the output is 20 characters wide and “hello world” sits in the middle of it.

txt = "hello world"

x = txt.center(20)

print(x)	# '    hello world     '
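As a small sketch of the optional fill character, the same string can be centered with "*" instead of spaces. The variable names here are illustrative.

```python
txt = "hello world"

starred = txt.center(20, "*")
print(starred)              # ****hello world*****

# A width shorter than the string leaves the string unchanged
unchanged = txt.center(5)
print(unchanged)            # hello world
```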

6.4. count()

It returns the number of times a specified value appears in the string. It comes in two forms:

count(value) – value to search for in the string.
count(value, start, end) – value to search for in the string, where the search starts from start position till end position.

txt = "hello world"

print( txt.count("o") )			# 2

print( txt.count("o", 4, 7) )	# 1

6.5. encode()

It encodes the string, using the specified encoding. If no encoding is specified, UTF-8 will be used.

txt = "My name is åmber"

x = txt.encode()

print(x)	# b'My name is \xc3\xa5mber'

6.6. endswith()

It returns True if the string ends with the specified value, otherwise False.

txt = "hello world"

print( txt.endswith("world") )		# True

print( txt.endswith("planet") )		# False

6.7. expandtabs()

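It replaces each tab character with spaces, using the specified tab size (the default is 8). The number of spaces inserted depends on the current column, so that the text after the tab starts at the next tab stop.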

txt = "hello\tworld"

print( txt.expandtabs(2) )		# 'hello world'

print( txt.expandtabs(4) )		# 'hello   world'

print( txt.expandtabs(16) )		# 'hello           world'

6.8. find()

It finds the first occurrence of the specified value. It returns -1 if the specified value is not found in the string.

find() is the same as the index() method; the only difference is that index() raises an exception if the value is not found.

txt = "My name is Lokesh Gupta"

x = txt.find("e")

print(x)		# 6
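The -1 return value can be sketched as follows. The optional start argument is also shown; the search strings are chosen only for illustration.

```python
txt = "My name is Lokesh Gupta"

pos = txt.find("z")
print(pos)                  # -1

if pos == -1:
    print("not found")      # not found

# An optional start index skips the earlier match at position 6
later = txt.find("e", 7)
print(later)                # 14
```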

6.9. format()

It formats the string by inserting the argument values into the string’s placeholders.

age = 36
name = 'Lokesh'

txt = "My name is {} and my age is {}"

print( txt.format(name, age) )	# My name is Lokesh and my age is 36

6.10. format_map()

It formats the string using the values of a dictionary (or other mapping), looked up by the names used in the placeholders.

params = {'name':'Lokesh Gupta', 'age':'38'} 

txt = "My name is {name} and age is {age}"

x = txt.format_map(params)

print(x)		# My name is Lokesh Gupta and age is 38

6.11. index()

It finds the first occurrence of the specified value in the given string.
It raises an exception if the value to be searched is not found.

txt = "My name is Lokesh Gupta"

x = txt.index("e")

print(x)		# 6

x = txt.index("z")	# ValueError: substring not found

6.12. isalnum()

It returns True if the string is not empty and all its characters are alphanumeric, meaning letters (such as a-z and A-Z) and numbers (such as 0-9).

print("LokeshGupta".isalnum())		# True

print("Lokesh Gupta".isalnum())		# False - Contains space

6.13. isalpha()

It returns True if all the characters are alphabetic letters (such as a-z and A-Z).

print("LokeshGupta".isalpha())			# True

print("Lokesh Gupta".isalpha())			# False - Contains space

print("LokeshGupta38".isalpha())		# False - Contains numbers

6.14. isdecimal()

It returns True if all the characters are decimal characters (0-9), otherwise False.

print("LokeshGupta".isdecimal())	# False

print("12345".isdecimal())			# True

print("123.45".isdecimal())			# False - Contains 'point'

print("1234 5678".isdecimal())		# False - Contains space

6.15. isdigit()

It returns True if all the characters are digits, otherwise False. Superscript digits, such as ², are also considered to be digits.

print("LokeshGupta".isdigit())		# False

print("12345".isdigit())			# True

print("123.45".isdigit())			# False - contains decimal point

print("1234\u00B2".isdigit())		# True - unicode for square 2

6.16. isidentifier()

It returns True if the string is a valid identifier, otherwise False.

A valid identifier contains only letters (a-z, A-Z), digits (0-9), or underscores ( _ ). It cannot start with a digit or contain any spaces.

print( "Lokesh_Gupta_38".isidentifier() )		# True

print( "38_Lokesh_Gupta".isidentifier() )		# False - Start with number

print( "_Lokesh_Gupta".isidentifier() )			# True

print( "Lokesh Gupta 38".isidentifier() )		# False - Contain spaces

6.17. islower()

It returns True if all the characters are in lower case, otherwise False. Numbers, symbols and spaces are not checked, only alphabet characters.

print( "LokeshGupta".islower() )		# False

print( "lokeshgupta".islower() )		# True

print( "lokesh_gupta".islower() )		# True

print( "lokesh_gupta_38".islower() )	# True

6.18. isnumeric()

It returns True if all the characters are numeric (0-9), otherwise False. Superscript digits, such as ², are also considered to be numeric values.

print("LokeshGupta".isnumeric())	# False

print("12345".isnumeric())			# True

print("123.45".isnumeric())			# False - contains decimal point

print("1234\u00B2".isnumeric())		# True - unicode for square 2
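The three checks isdecimal(), isdigit() and isnumeric() agree on ordinary digits but differ on other Unicode characters. The following sketch compares them on three sample characters chosen for illustration.

```python
# '5', superscript two, and the one-half fraction character
samples = ["5", "\u00B2", "\u00BD"]

results = {}
for ch in samples:
    results[ch] = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    print(repr(ch), results[ch])

# '5' (True, True, True)
# '²' (False, True, True)
# '½' (False, False, True)
```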

6.19. isprintable()

It returns True if all the characters are printable, otherwise False. A plain space counts as printable. Non-printable characters are used to indicate certain formatting actions, such as:

Carriage returns
Tabs
Line breaks
Page breaks
Null characters

print("LokeshGupta".isprintable())		# True

print("Lokesh Gupta".isprintable())		# True

print("Lokesh\tGupta".isprintable())	# False

6.20. isspace()

It returns True if all the characters in a string are whitespace characters, otherwise False.
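A short sketch of isspace() on a few sample strings, which are illustrative only:

```python
only_spaces = "   ".isspace()
mixed_whitespace = " \t\n".isspace()
with_letter = "  a  ".isspace()
empty = "".isspace()

print(only_spaces)          # True
print(mixed_whitespace)     # True - tabs and newlines are whitespace too
print(with_letter)          # False - contains a letter
print(empty)                # False - an empty string has no characters
```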

6.21. istitle()

It returns True if every word in the text starts with an upper case letter and the rest of each word is in lower case, i.e. Title Case. Otherwise it returns False.

print("Lokesh Gupta".istitle())		# True

print("Lokesh gupta".istitle())		# False

6.22. isupper()

It returns True if all the characters are in upper case, otherwise False. Numbers, symbols and spaces are not checked, only alphabet characters.

print("LOKESHGUPTA".isupper())		# True

print("LOKESH GUPTA".isupper())		# True

print("Lokesh Gupta".isupper())		# False

6.23. join()

It takes all items in an iterable and joins them into one string, using the string it is called on as the separator.

myTuple = ("Lokesh", "Gupta", "38")

x = "#".join(myTuple)

print(x)	# Lokesh#Gupta#38

6.24. ljust()

This method will left align the string, using a specified character (space is default) as the fill character.

txt = "lokesh"

x = txt.ljust(20, "-")

print(x)	# lokesh--------------

6.25. lower()

It returns a string where all characters are lower case. Symbols and numbers are ignored.

txt = "Lokesh Gupta"

x = txt.lower()

print(x)	# lokesh gupta

6.26. lstrip()

It removes any leading characters. By default it removes whitespace; if an argument is given, every leading character that appears in that argument is removed.

txt = "#Lokesh Gupta"

x = txt.lstrip("#_,.")

print(x)	# Lokesh Gupta
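The argument to lstrip() is treated as a set of characters rather than a prefix. A small sketch with made-up sample strings:

```python
padded = "   Lokesh Gupta"
print(padded.lstrip())      # Lokesh Gupta

url = "www.example.com"     # illustrative sample value
trimmed = url.lstrip("w.")  # strips any leading 'w' or '.' characters
print(trimmed)              # example.com
```

Stripping stops at the first character that is not in the given set, which is why the later "." characters are kept.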

6.27. maketrans()

It creates a mapping of each character to its translation/replacement. This mapping is then passed to the translate() method, which replaces each character with its mapped value.

mapping = {"a": "123", "b": "456", "c": "789"}

string = "abc"

print(string.maketrans(mapping))	# {97: '123', 98: '456', 99: '789'}
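To show how the table is used in practice, here is a minimal sketch that builds a table from two equal-length strings and passes it to translate(). The sample text and variable names are illustrative.

```python
source = "aabbcc"                        # illustrative sample text

# Two equal-length strings: a->x, b->y, c->z
table = source.maketrans("abc", "xyz")
print(table)                             # {97: 120, 98: 121, 99: 122}

print(source.translate(table))           # xxyyzz
```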

6.28. partition()

It searches for a specified string in given text, and splits the string into a tuple containing three elements:

The first element contains the part before the specified string.
The second element contains the specified string.
The third element contains the part after the string.

txt = "my name is lokesh gupta"

x = txt.partition("lokesh")

print(x)	# ('my name is ', 'lokesh', ' gupta')

print(x[0])	# my name is
print(x[1])	# lokesh
print(x[2])	#  gupta

6.29. replace()

It replaces a specified phrase with another specified phrase. It comes in two forms:

string.replace(oldvalue, newvalue)
string.replace(oldvalue, newvalue, count) – ‘count’ specifies how many occurrences you want to replace. Default is all occurrences.

txt = "A A A A A"

x = txt.replace("A", "B")

print(x)	# B B B B B

x = txt.replace("A", "B", 2)

print(x)	# B B A A A

6.30. rfind()

It finds the last occurrence of the specified value. It returns -1 if the value is not found in given text.

txt = "my name is lokesh gupta"

x = txt.rfind("lokesh")		

print(x)		# 11

x = txt.rfind("amit")		

print(x)		# -1
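The difference from find() only shows when the value occurs more than once. A short sketch using the same text:

```python
txt = "my name is lokesh gupta"

first = txt.find("e")       # first 'e' is in "name"
last = txt.rfind("e")       # last 'e' is in "lokesh"

print(first)                # 6
print(last)                 # 14
```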

6.31. rindex()

It finds the last occurrence of the specified value and raises an exception if the value is not found.

txt = "my name is lokesh gupta"

x = txt.rindex("lokesh")		

print(x)				# 11

x = txt.rindex("amit")	# ValueError: substring not found

6.32. rjust()

It will right align the string, using a specified character (space is default) as the fill character.

txt = "lokesh"

x = txt.rjust(20,"#")

print(x, "is my name")	# ##############lokesh is my name

6.33. rpartition()

It searches for the last occurrence of a specified string, and splits the string into a tuple containing three elements.

The first element contains the part before the specified string.
The second element contains the specified string.
The third element contains the part after the string.

txt = "my name is lokesh gupta"

x = txt.rpartition("lokesh")

print(x)	# ('my name is ', 'lokesh', ' gupta')

print(x[0])	# my name is
print(x[1])	# lokesh
print(x[2])	#  gupta
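When the separator occurs several times, partition() and rpartition() split at different places. The file name below is a made-up example.

```python
path = "archive.tar.gz"             # hypothetical file name

head = path.partition(".")          # splits at the first '.'
tail = path.rpartition(".")         # splits at the last '.'

print(head)     # ('archive', '.', 'tar.gz')
print(tail)     # ('archive.tar', '.', 'gz')

# If the separator is missing, the whole string ends up in the last element
missing = path.rpartition("/")
print(missing)  # ('', '', 'archive.tar.gz')
```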

6.34. rsplit()

It splits a string into a list, starting from the right.

txt = "apple, banana, cherry"

x = txt.rsplit(", ")

print(x)	# ['apple', 'banana', 'cherry']
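Without a limit, rsplit() gives the same result as split(). The direction only matters when the optional maxsplit argument is passed, as this sketch suggests:

```python
txt = "apple, banana, cherry"

from_right = txt.rsplit(", ", 1)    # at most one split, counted from the right
from_left = txt.split(", ", 1)      # at most one split, counted from the left

print(from_right)   # ['apple, banana', 'cherry']
print(from_left)    # ['apple', 'banana, cherry']
```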

6.35. rstrip()

It removes any trailing characters (characters at the end of a string). By default it removes whitespace.

txt = "     lokesh     "

x = txt.rstrip()

print(x)	# '     lokesh'

6.36. split()

It splits a string into a list. You can specify the separator. The default separator is whitespace.

txt = "my name is lokesh"

x = txt.split()

print(x)	# ['my', 'name', 'is', 'lokesh']
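A brief sketch of passing an explicit separator, and of how that differs from the default whitespace splitting. The sample strings are illustrative.

```python
csv_line = "lokesh,gupta,38"        # illustrative sample value

fields = csv_line.split(",")
print(fields)       # ['lokesh', 'gupta', '38']

# Default splitting collapses runs of whitespace...
words = "my   name".split()
print(words)        # ['my', 'name']

# ...while an explicit " " separator keeps the empty strings between spaces
pieces = "my   name".split(" ")
print(pieces)       # ['my', '', '', 'name']
```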

6.37. splitlines()

It splits a string into a list, by splitting at line breaks.

txt = "my name\nis lokesh"

x = txt.splitlines()

print(x)	# ['my name', 'is lokesh']

6.38. startswith()

It returns True if the string starts with the specified value, otherwise False. String comparison is case-sensitive.

txt = "my name is lokesh"

print( txt.startswith("my") )	# True

print( txt.startswith("My") )	# False

6.39. strip()

It removes leading characters (at the beginning) and trailing characters (at the end) from the string. By default it removes whitespace.

txt = "   my name is lokesh   "

print( txt.strip() )	# 'my name is lokesh'
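strip() also accepts a set of characters to remove from both ends. A minimal sketch with made-up sample strings:

```python
tagged = "##my name is lokesh##"
print(tagged.strip("#"))        # my name is lokesh

# Several characters can be listed; each is stripped from both ends
heading = "-- title --"
cleaned = heading.strip("- ")
print(cleaned)                  # title
```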

6.40. swapcase()

It returns a string where all the upper case letters are lower case and vice versa.

txt = "My Name Is Lokesh Gupta"

print( txt.swapcase() )	# mY nAME iS lOKESH gUPTA

6.41. title()

It returns a string where the first letter of every word is in upper case and the remaining letters are in lower case. If a word starts with a number or a symbol, the first letter after that is converted to upper case.

print( "lokesh gupta".title() )	# Lokesh Gupta

print( "38lokesh gupta".title() )	# 38Lokesh Gupta

print( "1. lokesh gupta".title() )	# 1. Lokesh Gupta

6.42. translate()

It takes the translation table to replace/translate characters in the given string as per the mapping table.

translation = {97: None, 98: None, 99: 105}

string = "abcdef"	

print( string.translate(translation) )	# idef

6.43. upper()

It returns a string where all characters are in upper case. Symbols and Numbers are ignored.

txt = "lokesh gupta"

print( txt.upper() )	# LOKESH GUPTA

6.44. zfill()

It adds zeros (0) at the beginning of the string until it reaches the specified length.

txt = "100"

x = txt.zfill(10)

print(x)	# 0000000100

Happy Learning !!