Regex Boundary Matcher: Matching Begin-with / End-with

Sometimes we need to pick out the lines in a log that start with a certain word or end with a certain word. In this Java regex boundary matcher tutorial, we will learn to create a regex that selects lines that either start or end with a given word.

1. Boundary Matchers

Boundary matchers are special characters or sequences used in regular expressions (regex) to match positions in a string rather than characters. Because they consume no characters (they are zero-width), they anchor the rest of the regular expression at those positions.
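To make the zero-width idea concrete, here is a minimal sketch that lists every position at which \b matches in a string. The class name ZeroWidthBoundaryDemo and the method wordBoundaryPositions are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroWidthBoundaryDemo {

    // Returns the positions at which \b matches in the given text.
    // (Hypothetical helper, used only for this illustration.)
    static List<Integer> wordBoundaryPositions(String text) {
        List<Integer> positions = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\b").matcher(text);
        while (matcher.find()) {
            // start() equals end() here: the boundary consumes no characters
            positions.add(matcher.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(wordBoundaryPositions("a cat")); // [0, 1, 2, 5]
    }
}
```

Each reported position lies between two characters (or at an edge of the input), and the matched text is always the empty string.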

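The following table lists and explains the commonly used boundary matchers in Java.

| Matcher | Description | Example Expression | Matches / Does Not Match |
|---------|-------------|--------------------|--------------------------|
| ^ | The beginning of a line | ^Hello | Matches lines starting with “Hello” (every line with Pattern.MULTILINE, otherwise only the start of the input) |
| $ | The end of a line | world$ | Matches lines ending with “world” (every line with Pattern.MULTILINE, otherwise only the end of the input) |
| \b | A word boundary | \bcat\b | Matches the whole word “cat”. Does not match “catch” or “category” |
| \B | A non-word boundary | \Bcat\B | Matches “cat” inside “concatenate” or “education”. Does not match the whole word “cat”, nor “catch” or “category”, where “cat” begins at a word boundary |
| \A | The beginning of the input | \AHello | Matches strings starting with “Hello” |
| \G | The end of the previous match | \Gword | Matches “word” only where the previous match ended (on the first attempt, at the start of the input) |
| \Z | The end of the input (excluding final line terminator) | world\Z | Matches “world” at the end of the input string, even if one line terminator follows it |
| \z | The end of the input | world\z | Matches “world” only at the very end of the input string |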
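As a rough sketch of the log filtering mentioned in the introduction, the anchors ^ and $ can be combined with \b to keep only the lines that start or end with a given word. The keywords ERROR and timeout, the sample lines, and the class name LogLineFilter are assumptions chosen purely for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class LogLineFilter {

    // Hypothetical keywords: keep a line if it starts with the word ERROR
    // or ends with the word timeout. \b stops "ERRORS" from qualifying.
    private static final Pattern STARTS_OR_ENDS =
            Pattern.compile("^ERROR\\b|\\btimeout$");

    // Each line is tested on its own, so Pattern.MULTILINE is not needed.
    static List<String> filter(List<String> lines) {
        return lines.stream()
                .filter(line -> STARTS_OR_ENDS.matcher(line).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "ERROR disk full",
                "WARN connection timeout",
                "ERRORS: 3",
                "INFO timeout reached");
        // Prints the first two lines only
        filter(lines).forEach(System.out::println);
    }
}
```

The sketch uses find() rather than matches(), so the pattern does not have to describe the whole line; the anchors alone pin the word to the start or the end.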

2. Regex Boundary Matcher Example

The following Java code example demonstrates the use of each boundary matcher symbol in regular expressions.

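The Pattern.MULTILINE flag enables multiline mode, in which ^ and $ match at the start and end of every line instead of only at the start and end of the whole input. The flag has no effect on \b, \B, \A, \G, \Z, or \z.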

import java.util.regex.*;

public class BoundaryMatcherExample {

    public static void main(String[] args) {

        // Define a multiline string using text block
        String input = """
            Hello world
            Goodbye world
            Catch a cat
            catamaran
            """;

        // Define a StringBuilder to capture the output
        StringBuilder output = new StringBuilder();

        // ^ - Beginning of Line
        Pattern beginningPattern = Pattern.compile("^Hello", Pattern.MULTILINE);
        Matcher beginningMatcher = beginningPattern.matcher(input);
        while (beginningMatcher.find()) {
            output.append("Match found (Beginning of Line): ").append(beginningMatcher.group()).append("\n");
        }

        // $ - End of Line
        Pattern endPattern = Pattern.compile("world$", Pattern.MULTILINE);
        Matcher endMatcher = endPattern.matcher(input);
        while (endMatcher.find()) {
            output.append("Match found (End of Line): ").append(endMatcher.group()).append("\n");
        }

        // \b - Word Boundary
        Pattern wordBoundaryPattern = Pattern.compile("\\bcat\\b", Pattern.MULTILINE);
        Matcher wordBoundaryMatcher = wordBoundaryPattern.matcher(input);
        while (wordBoundaryMatcher.find()) {
            output.append("Match found (Word Boundary): ").append(wordBoundaryMatcher.group()).append("\n");
        }

        // \B - Non-Word Boundary
        // Finds nothing in this input: every "cat" here begins at a word boundary
        Pattern nonWordBoundaryPattern = Pattern.compile("\\Bcat\\B", Pattern.MULTILINE);
        Matcher nonWordBoundaryMatcher = nonWordBoundaryPattern.matcher(input);
        while (nonWordBoundaryMatcher.find()) {
            output.append("Match found (Non-Word Boundary): ").append(nonWordBoundaryMatcher.group()).append("\n");
        }

        // \A - Beginning of Input
        Pattern beginningInputPattern = Pattern.compile("\\AHello");
        Matcher beginningInputMatcher = beginningInputPattern.matcher(input);
        while (beginningInputMatcher.find()) {
            output.append("Match found (Beginning of Input): ").append(beginningInputMatcher.group()).append("\n");
        }

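        // \G - End of Previous Match (the start of the input on the first attempt)
        // Finds nothing in this input: it begins with "He", not "oo"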
        Pattern endPreviousPattern = Pattern.compile("\\Goo");
        Matcher endPreviousMatcher = endPreviousPattern.matcher(input);
        while (endPreviousMatcher.find()) {
            output.append("Match found (End of Previous Match): ").append(endPreviousMatcher.group()).append("\n");
        }

        // \Z - End of Input (excluding final line terminator)
        // Finds nothing in this input: it ends with "catamaran", and MULTILINE does not affect \Z
        Pattern endInputPattern = Pattern.compile("world\\Z", Pattern.MULTILINE);
        Matcher endInputMatcher = endInputPattern.matcher(input);
        while (endInputMatcher.find()) {
            output.append("Match found (End of Input excluding final line terminator): ").append(endInputMatcher.group()).append("\n");
        }

        // \z - End of Input
        // Finds nothing in this input: it ends with a newline, and MULTILINE does not affect \z
        Pattern endInputAbsolutePattern = Pattern.compile("world\\z", Pattern.MULTILINE);
        Matcher endInputAbsoluteMatcher = endInputAbsolutePattern.matcher(input);
        while (endInputAbsoluteMatcher.find()) {
            output.append("Match found (End of Input): ").append(endInputAbsoluteMatcher.group()).append("\n");
        }

        // Print the captured output
        System.out.println(output.toString());
    }
}

The program output:

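Match found (Beginning of Line): Hello
Match found (End of Line): world
Match found (End of Line): world
Match found (Word Boundary): cat
Match found (Beginning of Input): Hello

Only ^, $, \b and \A find a match in this input. The pattern world$ matches twice because, in multiline mode, $ matches at the end of each of the first two lines. The pattern \bcat\b matches the standalone “cat” in the third line; “Catch” is skipped because matching is case-sensitive.

The remaining patterns print nothing. \Bcat\B needs a word character on both sides of “cat”, but every “cat” in the input begins at a word boundary. \Goo is anchored at the start of the input on the first attempt, and the input begins with “He”, so it cannot match there or anywhere else. world\Z and world\z require “world” at the end of the whole input, which ends with “catamaran” followed by a newline.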
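The sketch below uses different inputs, chosen for illustration and not taken from the program above, on which the behavior of \B, \G, \Z and \z is easy to see. The class name BoundaryMatcherVariants and the helper findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryMatcherVariants {

    // Collects every match of the regex in the input, in order.
    // (Hypothetical helper, used only for this illustration.)
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // \B requires word characters on both sides of "cat":
        // the standalone "cat" is skipped, the two embedded ones match
        System.out.println(findAll("\\Bcat\\B", "cat concatenate education")); // [cat, cat]

        // \G chains matches: each one must start where the previous one ended,
        // so the chain stops at "-" and "44" is never reached
        System.out.println(findAll("\\G\\d{2}", "112233-44")); // [11, 22, 33]

        // \Z tolerates one final line terminator, \z does not
        System.out.println(findAll("world\\Z", "Hello world\n")); // [world]
        System.out.println(findAll("world\\z", "Hello world\n")); // []
        System.out.println(findAll("world\\z", "Hello world"));   // [world]
    }
}
```

The \Z and \z lines differ only in the trailing newline of the input, which is often the deciding factor when the text comes from a file or a text block.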

Let me know your thoughts on this Java regex word boundary example.

Happy Learning !!

References: Java regex docs
