Splitting a String

Learn to split or tokenize a string into an array. Splitting a string is a very common task, especially in web applications where we pass data in CSV format or separate values with some other delimiter such as $ or #.

1. Guava Splitter

Guava's Splitter class is a good first choice. The code reads well and the splitter is reusable: we create it once and reuse it as many times as we want. This helps keep the splitting logic uniform across similar use cases.

Another benefit is that it provides some useful methods while building the splitter itself, such as omitEmptyStrings() and trimResults(), which eliminates a lot of clean-up work after the tokens have been created.

To build such a splitter, write code like this:

Splitter niceCommaSplitter = Splitter.on(',').omitEmptyStrings().trimResults();

And now reuse it anywhere in the code:

import com.google.common.base.Splitter;

Splitter niceCommaSplitter = Splitter.on(',').omitEmptyStrings().trimResults();

// Prints the five trimmed, non-empty tokens, one per line: I, am, Legend, oh, you ?
Iterable<String> tokens2 = niceCommaSplitter.split("I,am ,Legend, , oh ,you ?");
for (String token : tokens2) {
    System.out.println(token);
}

For reference, we can download the Guava library from the project's home page.

Alternatively, we can include it directly as a Maven dependency.

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>17.0</version>
</dependency>

2. StringUtils.split() from Apache Commons

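The StringUtils class provides a lot of useful methods to perform common operations on Strings such as search, replace, reverse or check empty. All operations are null safe.

StringUtils.split() works much like Java's built-in String.split() and also returns a String array. The differences are that the separator is treated as a set of characters rather than a regular expression, adjacent separators are treated as one separator, and a null input returns null. Because no regular expression is involved, it can also be faster.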

split(String str, String separatorChars, int max)
  • str – the String to parse, may be null.
  • separatorChars – the characters used as the delimiters. The default value is whitespace. (Optional parameter)
  • max – the maximum number of elements to include in the array. A zero or negative value implies no limit. (Optional parameter)

Java program to split a string using StringUtils.

String[] tokens = StringUtils.split("I,am ,Legend, , oh ,you ?",",");

for (String token : tokens)
{
	System.out.println(token);
}

3. String.split() Method

The String.split() method is the simplest built-in way to split a string. It takes a regular expression as the delimiter and returns the tokens as a String array, which we are free to process as we wish.

Java program to split a string on the comma delimiter.

String[] tokens = "I,am ,Legend, , oh ,you ?".split(",");

for (String token : tokens)
{
	System.out.println(token);
}
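The program above prints the tokens exactly as they appear between the commas, including surrounding spaces and the blank " " token. A minimal sketch of the usual clean-up follows, along with two details of split() that are easy to miss: the argument is a regular expression, and an optional limit caps the number of tokens. The class name StringSplitSketch and the helper cleanTokens() are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class StringSplitSketch {

    // Splits on commas, trims each token, and drops tokens that are empty after trimming.
    static List<String> cleanTokens(String input) {
        List<String> result = new ArrayList<>();
        for (String token : input.split(",")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "I,am ,Legend, , oh ,you ?";

        // The raw split keeps the whitespace-only token, so six tokens come back.
        System.out.println(input.split(",").length);                    // 6

        // After trimming and filtering, five tokens remain.
        System.out.println(cleanTokens(input));                         // [I, am, Legend, oh, you ?]

        // The delimiter is a regex, so a metacharacter such as '$' must be escaped.
        System.out.println(String.join("|", "a$b$c".split("\\$")));     // a|b|c

        // A positive limit caps the token count; the last token holds the remainder.
        System.out.println(String.join("|", "a,b,c,d".split(",", 2)));  // a|b,c,d
    }
}
```

If the delimiter comes from user input or configuration, Pattern.quote() is a safer way to escape it than adding backslashes by hand.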

4. StringTokenizer (Legacy)

StringTokenizer has long been an easy way to split strings. This class allows an application to break a string into tokens.

  • The methods in this class do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
  • The set of delimiters may be specified either at creation time or on a per-token basis.
  • If not specified then the default delimiter set is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character.
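Changing the delimiter set on a per-token basis is done with nextToken(String delim). The sketch below is only an illustration (the class name, the fields() helper and the timestamp format are assumptions, not part of the original examples). Note that the tokenizer stops on a delimiter without consuming it, so the new set usually has to include the previous delimiter as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class PerTokenDelimiterSketch {

    // Splits a timestamp such as "2024-01-15 10:30" into its five numeric fields.
    static List<String> fields(String timestamp) {
        StringTokenizer st = new StringTokenizer(timestamp, "-");
        List<String> out = new ArrayList<>();
        out.add(st.nextToken());      // year, delimiter set is "-"
        out.add(st.nextToken());      // month
        // Keep '-' in the new set: the tokenizer is still positioned on the '-' before the day.
        out.add(st.nextToken("- "));  // day
        out.add(st.nextToken(" :"));  // hour
        out.add(st.nextToken());      // minute, the set " :" stays in effect
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("2024-01-15 10:30"));  // [2024, 01, 15, 10, 30]
    }
}
```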

4.1. Constructor

public StringTokenizer(String str,
                       String delim,
                       boolean returnDelims)
  • str – a string to be parsed.
  • delim – the delimiters. (Optional parameter)
  • returnDelims – flag indicating whether to return the delimiters as tokens. (Optional parameter)
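To see what the returnDelims flag changes, here is a small sketch that tokenizes the same input twice. The class name, the tokens() helper and the sample string "a=1;b=2" are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsSketch {

    // Collects every token produced by a StringTokenizer into a list.
    static List<String> tokens(String str, String delim, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(str, delim, returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Delimiters are skipped when the flag is false.
        System.out.println(tokens("a=1;b=2", "=;", false));  // [a, 1, b, 2]

        // With the flag set, each delimiter comes back as its own one-character token.
        System.out.println(tokens("a=1;b=2", "=;", true));   // [a, =, 1, ;, b, =, 2]
    }
}
```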

4.2. Single Delimiter

Java program to split a string by whitespace using the default delimiter set.

String str = "I am sample string and will be tokenized on space";

StringTokenizer defaultTokenizer = new StringTokenizer(str);

System.out.println("Total number of tokens found : " + defaultTokenizer.countTokens());

while ( defaultTokenizer.hasMoreTokens() )
{
	System.out.println( defaultTokenizer.nextToken() );
}

4.3. Multiple Delimiters

This is a really useful case. Every character in the delimiter string is treated as a separate delimiter, so a string can be split on more than one delimiter at once.

String url = "https://howtodoinjava.com/java-initerview-questions";

StringTokenizer multiTokenizer = new StringTokenizer(url, "://.-");

while (multiTokenizer.hasMoreTokens())
{
	System.out.println( multiTokenizer.nextToken() );
}

As the Java docs say, StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
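As a rough sketch of that recommendation, the multiple-delimiter example can be rewritten with a regular expression character class. The class name, the URL_DELIMS constant and the splitUrl() helper are hypothetical; only Pattern.split() and String.split() are standard API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RegexSplitSketch {

    // A run of one or more of ':', '/', '.', '-' is treated as a single separator.
    // Compiling the pattern once lets it be reused across calls.
    private static final Pattern URL_DELIMS = Pattern.compile("[:/.-]+");

    static List<String> splitUrl(String url) {
        return Arrays.asList(URL_DELIMS.split(url));
    }

    public static void main(String[] args) {
        String url = "https://howtodoinjava.com/java-initerview-questions";

        // [https, howtodoinjava, com, java, initerview, questions]
        System.out.println(splitUrl(url));

        // The default StringTokenizer behaviour corresponds roughly to splitting on whitespace runs.
        String str = "I am sample string and will be tokenized on space";
        System.out.println(str.split("\\s+").length);  // 10
    }
}
```

One difference to keep in mind: unlike StringTokenizer, split() returns a leading empty string when the input starts with a delimiter.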

Happy Learning !!


