In this Java regex tutorial, we will learn to test whether the number of words in input text is within a certain minimum and maximum limit.
1. Regular Expression
The following regex is very similar to the one in the previous tutorial on limiting the number of non-whitespace characters, except that each repetition matches an entire word rather than a single non-whitespace character. It matches between 2 and 10 words, skipping past any non-word characters, including punctuation and whitespace:
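^\W*(?:\w+\b\W*){2,10}$
In a Java string literal, each backslash must be escaped, so the same regex is written as "^\\W*(?:\\w+\\b\\W*){2,10}$".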
The regex matches a string that:
- Starts with any number of non-word characters (or no non-word characters at all).
- Contains between 2 and 10 words (each word consisting of one or more word characters).
- Each word is followed by a word boundary (\b).
- After each word, there may be any number of non-word characters (including spaces, punctuation, etc.).
- The string must end after the 2nd to 10th word and any non-word characters that follow it.
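The word boundary in the list above is easy to overlook, so here is a minimal sketch of what it prevents. The class name WordBoundaryDemo and the variant pattern without \b are made up for this comparison and are not part of the tutorial's regex. Without the boundary, backtracking can split a single word such as "One" into "On" and "e" and count it as two words.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // The tutorial's pattern, written as a Java string literal.
    static final Pattern WITH_BOUNDARY =
            Pattern.compile("^\\W*(?:\\w+\\b\\W*){2,10}$");

    // Hypothetical variant with \b removed, for comparison only.
    static final Pattern WITHOUT_BOUNDARY =
            Pattern.compile("^\\W*(?:\\w+\\W*){2,10}$");

    public static void main(String[] args) {
        String input = "One";
        // \b forces each repetition to consume a whole word, so one word is rejected.
        System.out.println(WITH_BOUNDARY.matcher(input).matches());    // false
        // Without \b the engine can treat "On" and "e" as two repetitions.
        System.out.println(WITHOUT_BOUNDARY.matcher(input).matches()); // true
    }
}
```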
Example Matches:
- "Hello, world!" (2 words)
- "one-two-three 4 5" (5 words)
- " test , input " (2 words, with spaces and commas)
Example Non-Matches:
- "One" (only 1 word, which is fewer than 2)
- "This is a really long sentence with far too many words in it" (13 words, which is more than 10)
2. Java Example
The following Java program demonstrates the use of the Pattern and Matcher classes to compile and execute the regex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordCountLimit {
    public static void main(String[] args) {
        // Regex that accepts between 2 and 10 words
        String regex = "^\\W*(?:\\w+\\b\\W*){2,10}$";
        Pattern pattern = Pattern.compile(regex);
        // Test input (3 words)
        String input = "Hello World Java";
        // matches() succeeds only if the entire input matches the regex
        Matcher matcher = pattern.matcher(input);
        if (matcher.matches()) {
            System.out.println("Valid input: " + input); // Prints this
        } else {
            System.out.println("Invalid input: " + input);
        }
    }
}
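Since the goal is a configurable minimum and maximum, the limits can also be passed in instead of being hard-coded. The following is only a sketch under that assumption: the class WordCountValidator and the parameters minWords and maxWords are hypothetical names chosen for illustration. It builds the same pattern with a different quantifier.

```java
import java.util.regex.Pattern;

class WordCountValidator {
    private final Pattern pattern;

    // minWords and maxWords are inclusive limits on the number of words.
    WordCountValidator(int minWords, int maxWords) {
        if (minWords < 0 || maxWords < minWords) {
            throw new IllegalArgumentException("Expected 0 <= minWords <= maxWords");
        }
        // Same structure as the tutorial's regex; only the {min,max} quantifier varies.
        this.pattern = Pattern.compile(
                "^\\W*(?:\\w+\\b\\W*){" + minWords + "," + maxWords + "}$");
    }

    // Returns true when the entire input contains an allowed number of words.
    boolean isValid(String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        WordCountValidator upToThree = new WordCountValidator(1, 3);
        System.out.println(upToThree.isValid("Hello World Java"));   // true
        System.out.println(upToThree.isValid("one two three four")); // false
    }
}
```

Compiling the pattern once in the constructor avoids recompiling it on every call, which matters if the validator is used repeatedly.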
I advise you to play with the above regular expression and try more variations.
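One way to experiment is to run the pattern against several inputs at once. The sketch below assumes a small helper named isWithinLimit (a hypothetical name). The first four inputs come from the examples in section 1, and the last, eleven-word input was made up here to exceed the maximum.

```java
import java.util.regex.Pattern;

class WordLimitSamples {
    // The tutorial's pattern: between 2 and 10 words.
    static final Pattern WORD_LIMIT =
            Pattern.compile("^\\W*(?:\\w+\\b\\W*){2,10}$");

    // Returns true when the whole input contains between 2 and 10 words.
    static boolean isWithinLimit(String input) {
        return WORD_LIMIT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hello, world!",      // 2 words
            "one-two-three 4 5",  // 5 words
            " test , input ",     // 2 words
            "One",                // 1 word
            "one two three four five six seven eight nine ten eleven" // 11 words
        };
        // Prints true for the first three inputs and false for the last two.
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isWithinLimit(sample));
        }
    }
}
```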
Happy Learning !!