Python String split()

Python examples that split a string into a list of tokens using a single delimiter such as a space or a comma, or using multiple delimiters with a regular expression.

1. Python split(separator, maxsplit) Syntax

The syntax of the split() method is:

string.split(separator, maxsplit)
  • Both parameters are optional.
  • The separator is the delimiter on which the string is split (the Python documentation names this parameter sep). By default, any run of whitespace (spaces, tabs, newlines etc.) acts as a separator.
  • The maxsplit specifies the maximum number of splits to perform. The default value is -1, which means “all occurrences”.
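To make the effect of maxsplit concrete, here is a minimal sketch. The variable names text, limited and unlimited are chosen only for illustration.

```python
text = 'how to do in java'

# maxsplit=2 performs at most two splits, so the list has at most three items;
# the rest of the string is left intact as the last item.
limited = text.split(' ', 2)
print(limited)    # ['how', 'to', 'do in java']

# Omitting maxsplit (or passing -1) splits on every occurrence.
unlimited = text.split(' ')
print(unlimited)  # ['how', 'to', 'do', 'in', 'java']
```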

2. Default Behavior

By default, the split() method splits the string on whitespace and places no limit on the number of tokens. Consecutive whitespace characters are treated as a single separator. In the following example, the string contains uneven spacing between the words.

>>> text = 'how to do in       java'   # 'text' avoids shadowing the built-in str
>>> text.split()     # default separator (whitespace), no limit on splits
['how', 'to', 'do', 'in', 'java']
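The default behavior differs from passing a single space explicitly, which can be easy to overlook. The following sketch (with illustrative variable names) compares the two on a string that contains two consecutive spaces.

```python
text = 'how to  do'   # two spaces between 'to' and 'do'

# No separator: a run of whitespace is treated as one delimiter.
default_tokens = text.split()
print(default_tokens)  # ['how', 'to', 'do']

# Explicit single space: every space is a delimiter, so the
# two consecutive spaces produce an empty string between them.
space_tokens = text.split(' ')
print(space_tokens)    # ['how', 'to', '', 'do']
```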

3. Split by Comma

In the following example, we are using the comma as a delimiter for splitting the string.

>>> text = 'how,to,do,in,java'   # 'text' avoids shadowing the built-in str
>>> text.split(',')     # split the string on the comma delimiter
['how', 'to', 'do', 'in', 'java']
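If the input has spaces around the commas, split(',') keeps them inside the tokens. One possible way to handle this, sketched here with an assumed sample string, is to strip each token after splitting.

```python
text = 'how, to ,do,  in,java'

# split(',') keeps any whitespace that surrounds each token.
raw_tokens = text.split(',')
print(raw_tokens)    # ['how', ' to ', 'do', '  in', 'java']

# Strip each piece when the padding is not wanted.
clean_tokens = [token.strip() for token in raw_tokens]
print(clean_tokens)  # ['how', 'to', 'do', 'in', 'java']
```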

4. Splitting with Multiple Delimiters

The split() method of string objects is really meant for very simple cases, and does not allow for multiple delimiters or account for possible whitespace around the delimiters.

In cases where you need a bit more flexibility, use the re.split() function:

>>> import re
>>> line = 'how to; do, in,java,      dot, com'
>>> re.split(r'[;,\s]\s*', line)  # split on a comma, semicolon or whitespace character,
                                  # followed by any amount of extra whitespace
['how', 'to', 'do', 'in', 'java', 'dot', 'com']

When using re.split(), we need to be a bit careful should the regular expression pattern involve a capture group enclosed in parentheses. If capture groups are used, then the matched text is also included in the result.

For example, watch what happens here:

>>> import re
>>> line = 'how to; do, in,java,      dot, com'
>>> re.split(r'(;|,|\s)\s*', line)  # same delimiters, but inside a capture group,
                                    # so each matched delimiter is kept in the result
['how', ' ', 'to', ';', 'do', ',', 'in', ',', 'java', ',', 'dot', ',', 'com']
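If the parentheses are needed only for grouping the alternatives, a non-capturing group (?:...) is one way to keep the delimiters out of the result. A minimal sketch, reusing the same sample line:

```python
import re

line = 'how to; do, in,java,      dot, com'

# (?:...) groups the alternatives without capturing them,
# so the matched delimiters are not included in the output.
tokens = re.split(r'(?:;|,|\s)\s*', line)
print(tokens)  # ['how', 'to', 'do', 'in', 'java', 'dot', 'com']
```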

Happy Learning !!
