Java regex word boundary – match lines that start with or end with a word

Sometimes we need to pick out the lines of a log that start with a certain word or end with a certain word. In this Java regex word boundary tutorial, we will learn to write regular expressions that match lines which either start or end with a given word.

Table of Contents

1. Boundary matchers
2. Match word at the start of content
3. Match word at the end of content
4. Match word at the start of line
5. Match word at the end of line

1. Boundary matchers

Boundary matchers help to find a particular word, but only if it appears at a particular position, such as the beginning or end of a line. They do not match any characters. Instead, they match at certain positions, effectively anchoring the regular expression match at those positions.

The following table lists and explains all the boundary matchers.

Boundary token    Description
^                 The beginning of a line
$                 The end of a line
\b                A word boundary
\B                A non-word boundary
\A                The beginning of the input
\G                The end of the previous match
\Z                The end of the input but for the final terminator, if any
\z                The end of the input
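The table lists "\b" and "\B", but their effect is easiest to see side by side. The following is a minimal sketch, not part of the original examples: the class name WordBoundaryDemo, the helper matchStarts and the sample sentence are made up for illustration. It searches for "end" as a whole word, as the tail of a longer word, and as the head of a longer word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the sample text are assumptions.
class WordBoundaryDemo {

    // Returns the start index of every match of regex in text.
    static List<Integer> matchStarts(String regex, String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "end of the weekend, endless";

        // \bend\b : "end" only as a whole word
        System.out.println(matchStarts("\\bend\\b", text));  // prints [0]

        // \Bend\b : "end" only as the tail of a longer word ("weekend")
        System.out.println(matchStarts("\\Bend\\b", text));  // prints [15]

        // \bend\B : "end" only as the head of a longer word ("endless")
        System.out.println(matchStarts("\\bend\\B", text));  // prints [20]
    }
}
```

In every case the matched text is just "end". The boundary tokens consume no characters; they only restrict where the match may sit.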

2. Java regex word boundary – Match word at the start of content

The anchor "\A" always matches at the very start of the whole text, before the first character. That is the only place where it matches. Place "\A" at the start of your regular expression to test whether the content begins with the text you want to match.

The "A" must be uppercase. Alternatively, you can use "^", which matches only at the start of the content as long as multi-line mode is not turned on.

^wordToSearch OR \AwordToSearch

String content = 	"begin here to start, and go there to end\n" +
					"come here to begin, and end there to finish\n" +
					"begin here to start, and go there to end";
					
String regex 	= 	"^begin";
//OR
//String regex = "\\Abegin";

Pattern pattern = 	Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = 	pattern.matcher(content);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 0 End index: 5 begin

3. Java regex word boundary – Match word at the end of content

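The anchors "\Z" and "\z" match at the end of the content. "\z" matches only at the very end, after the last character. "\Z" matches there too, and also just before a final line terminator if the content ends with one. Place "\Z" or "\z" at the end of your regular expression to test whether the content ends with the text you want to match.

Alternatively, you can use "$" as well. Without multi-line mode it behaves like "\Z".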

wordToSearch$ OR wordToSearch\Z

String content = 	"begin here to start, and go there to end\n" +
					"come here to begin, and end there to finish\n" +
					"begin here to start, and go there to end";
					
String regex 	= 	"end$";
//OR
//String regex = "end\\Z";

Pattern pattern = 	Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = 	pattern.matcher(content);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 122 End index: 125 end
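The sample content above does not end with a line break, so "$", "\Z" and "\z" all give the same result. The difference only shows up when the content has a trailing line terminator. Here is a small sketch of that case; the class name EndAnchorDemo, the helper found and the two sample strings are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample strings are assumptions.
class EndAnchorDemo {

    // True if regex is found somewhere in text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String withNewline = "go there to end\n"; // ends with a line terminator
        String noNewline = "go there to end";

        // $ and \Z tolerate one final line terminator
        System.out.println(found("end$", withNewline));   // prints true
        System.out.println(found("end\\Z", withNewline)); // prints true

        // \z insists on the absolute end of the input
        System.out.println(found("end\\z", withNewline)); // prints false
        System.out.println(found("end\\z", noNewline));   // prints true
    }
}
```

If the content comes from a file that may or may not end with a newline, "\Z" or "$" is usually the more forgiving choice, while "\z" is the strict one.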

4. Java regex word boundary – Match word at the start of line

You can use "(?m)" to turn on “multi-line” mode to match a word at the start of every line.

“Multi-line” mode affects only the caret (^) and dollar ($) anchors. The anchors "\A", "\Z" and "\z" still match only at the start and end of the whole content.

(?m)^wordToSearch

String content = 	"begin here to start, and go there to end\n" +
					"come here to begin, and end there to finish\n" +
					"begin here to start, and go there to end";
String regex 	= 	"(?m)^begin";
Pattern pattern = 	Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = 	pattern.matcher(content);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 0 End index: 5 begin
Start index: 85 End index: 90 begin
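As a rough sketch of how multi-line mode changes things, the snippet below counts matches for the same word with and without the mode, and with "\A" instead of "^". It also uses the Pattern.MULTILINE flag, which has the same effect as the embedded "(?m)". The class name MultilineAnchorDemo and the helper countMatches are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the class and helper names are assumptions.
class MultilineAnchorDemo {

    // Counts how many times the compiled pattern is found in text.
    static int countMatches(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String content = "begin here to start, and go there to end\n"
                + "come here to begin, and end there to finish\n"
                + "begin here to start, and go there to end";

        // Default mode: ^ matches only at the start of the content
        System.out.println(countMatches(Pattern.compile("^begin"), content)); // prints 1

        // Multi-line mode via the embedded flag or the compile flag: ^ matches at every line start
        System.out.println(countMatches(Pattern.compile("(?m)^begin"), content)); // prints 2
        System.out.println(countMatches(Pattern.compile("^begin", Pattern.MULTILINE), content)); // prints 2

        // \A ignores multi-line mode and still matches only at the start of the content
        System.out.println(countMatches(Pattern.compile("\\Abegin", Pattern.MULTILINE), content)); // prints 1
    }
}
```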

5. Java regex word boundary – Match word at the end of line

You can use "(?m)" to turn on “multi-line” mode to match a word at the end of every line.

(?m)wordToSearch$

String content = 	"begin here to start, and go there to end\n" +
					"come here to begin, and end there to finish\n" +
					"begin here to start, and go there to end";
String regex 	= 	"(?m)end$";
Pattern pattern = 	Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = 	pattern.matcher(content);
while (matcher.find())
{
	System.out.print("Start index: " + matcher.start());
	System.out.print(" End index: " + matcher.end() + " ");
	System.out.println(matcher.group());
}

Output:

Start index: 37 End index: 40 end
Start index: 122 End index: 125 end
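To tie this back to the log-filtering use case from the introduction, here is one possible sketch that keeps whole lines rather than printing match positions. It splits the content into lines and tests each line with "^" and "$", adding "\b" so that "beginning" or "weekend" do not count as "begin" or "end". The class name LogLineFilter, the method filter and the extra sample lines are assumptions, and unlike the examples above the match is case-sensitive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; the class name, method name and sample lines are assumptions.
class LogLineFilter {

    // Keeps the lines that start with startWord or end with endWord (whole words only).
    static List<String> filter(String content, String startWord, String endWord) {
        // Pattern.quote makes sure the words are treated as literal text
        Pattern pattern = Pattern.compile(
                "^" + Pattern.quote(startWord) + "\\b|\\b" + Pattern.quote(endWord) + "$");
        List<String> kept = new ArrayList<>();
        for (String line : content.split("\\R")) { // \R matches any line break (Java 8+)
            if (pattern.matcher(line).find()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String content = "begin here to start, and go there to end\n"
                + "come here to begin, and end there to finish\n"
                + "beginning here, and go there for the weekend\n"
                + "come here, and go there to end";

        // Only the first and the last line are kept
        for (String line : filter(content, "begin", "end")) {
            System.out.println(line);
        }
    }
}
```

Because each line is matched on its own, multi-line mode is not needed here: "^" and "$" already refer to the start and end of the single line being tested.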

Let me know your thoughts on this Java regex word boundary example.

Happy Learning !!

