Java regex word boundary – match specific word or contain word

In this Java regex word boundary example, we will learn to match a specific word in a string. e.g. We will match “java” in “java is object oriented language”. But it should not match “javap” in “javap is another tool in JDL bundle”.

1. java regex word boundary matchers

Boundary matchers help to find a particular word, but only if it appears at the beginning or end of a line. They do not match any characters. Instead, they match at certain positions, effectively anchoring the regular expression match at those positions.

The following table lists and explains all the boundary matchers.

Boundary token
Description
^
The beginning of a line
$
The end of a line
\b
A word boundary
\B
A non-word boundary
\A
The beginning of the input
\G
The end of the previous match
\Z
The end of the input but for the final terminator, if any
\z
The end of the input

2. Java regex to match specific word

Solution Regex : \bword\b

The regular expression token "\b" is called a word boundary. It matches at the start or the end of a word. By itself, it results in a zero-length match.

Strictly speaking, “\b” matches in these three positions:

  • Before the first character in the data, if the first character is a word character
  • After the last character in the data, if the last character is a word character
  • Between two characters in the data, where one is a word character and the other is not a word character

To run a “spcific word only” search using a regular expression, simply place the word between two word boundaries.

String data1 = "Today, java is object oriented language";
      
String regex = "\\bjava\\b";

Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(data1);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 7 End index: 11 java

Please note that matching above regex with “Also, javap is another tool in JDL bundle” doesn’t produce any result i.e. doesn’t match any place.

3. Java regex to match word with nonboundaries – contain word example

Suppose, you want to match “java” such that it should be able to match words like “javap” or “myjava” or “myjavaprogram” i.e. java word can lie anywhere in the data string. It could be start of word with additional characters in end, or could be in end of word with additional characters in start as well as in between a long word.

"\B" matches at every position in the subject text where "\B" does not match. "\B" matches at every position that is not at the start or end of a word.

To match such words, use below regex :

Solution Regex : \\Bword|word\\B

String data1 = "Searching in words : javap myjava myjavaprogram";
      
String regex = "\\Bjava|java\\B";

Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(data1);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 21 End index: 25 java
Start index: 29 End index: 33 java
Start index: 36 End index: 40 java

Please note that it will not match “java” word in first example i.e. “Today, java is object oriented language” because “\\B” does not match start and end of a word.

3. Java regex to match word irrespective of boundaries

This is simplest usecase. You want to match “java” word in all four places in string “Searching in words : java javap myjava myjavaprogram”. To able to do so, simply don’t use anything.

Solution regex : word

String data1 = "Searching in words : java javap myjava myjavaprogram";
      
String regex = "java";

Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(data1);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 21 End index: 25 java
Start index: 26 End index: 30 java
Start index: 34 End index: 38 java
Start index: 41 End index: 45 java

That’s all for this java regex contain word example related to boundary and non-boundary matches of a specific word using java regular expressions.

Happy Learning !!

References:

Java regex docs

Was this post helpful?

Join 7000+ Awesome Developers

Get the latest updates from industry, awesome resources, blog updates and much more.

* We do not spam !!

1 thought on “Java regex word boundary – match specific word or contain word”

Leave a Comment

HowToDoInJava

A blog about Java and related technologies, the best practices, algorithms, and interview questions.