A regular expression defines a search pattern for strings. A pattern may match a given string once, several times, or not at all. The abbreviation for regular expression is regex. Regular expressions can be used to search, edit, and manipulate text.
Regex meta characters are special symbols that carry a particular meaning and define the pattern in a regular expression. These characters allow us to create flexible and powerful search patterns for text processing.
1. List of Regex Meta Characters
Here’s a list of common regex meta-characters and their meanings:
| Meta Character | Description | 
|---|---|
| . | Matches any single character except a newline. | 
| ^ | Matches the start of a string. | 
| $ | Matches the end of a string. | 
| * | Matches zero or more repetitions of the preceding element. | 
| + | Matches one or more repetitions of the preceding element. | 
| ? | Matches zero or one repetition of the preceding element (makes it optional). | 
| {n} | Matches exactly n repetitions of the preceding element. | 
| {n,} | Matches n or more repetitions of the preceding element. | 
| {n,m} | Matches between n and m repetitions of the preceding element. | 
| \ | Escapes a meta-character, allowing it to be treated as a literal character. | 
| [] | Matches any single character within the brackets. | 
| [^] | Matches any single character not within the brackets. | 
| () | Groups multiple tokens together and captures the matched text. | 
| \d | Matches any digit, equivalent to [0-9]. | 
| \D | Matches any non-digit character, equivalent to [^0-9]. | 
| \w | Matches any word character (alphanumeric + underscore), equivalent to [a-zA-Z0-9_]. | 
| \W | Matches any non-word character, equivalent to [^a-zA-Z0-9_]. | 
| \s | Matches any whitespace character (spaces, tabs, line breaks). | 
| \S | Matches any non-whitespace character. | 
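The shorthand classes (\d, \w, \s) and capturing groups from the table can be tried out with a short sketch like the one below. The class name ShorthandClassDemo, the helper findAll, and the sample text are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ShorthandClassDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "Order 66 shipped to unit_7";   // made-up sample text

        System.out.println(findAll("\\d+", input));         // [66, 7]
        System.out.println(findAll("\\w+", input));         // [Order, 66, shipped, to, unit_7]
        System.out.println(findAll("\\s", input).size());   // 4

        // Parentheses capture parts of the match; groups are numbered from 1.
        Matcher matcher = Pattern.compile("(\\w+)\\s(\\d+)").matcher(input);
        if (matcher.find()) {
            System.out.println(matcher.group(1));   // Order
            System.out.println(matcher.group(2));   // 66
        }
    }
}
```

Note that each backslash is doubled in the Java string literal, so the regex \d is written as "\\d".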
2. Regex Meta Characters Example
Let us see a few examples of using the meta characters in regular expressions and matching them.
2.1. Dot (.) Meta Character
The dot meta-character matches any single character except for a newline (\n). It is useful to match a pattern where the character can be anything.
Pattern pattern = Pattern.compile(".at");
Matcher matcher = pattern.matcher("cat bat rat sat mat");
while (matcher.find()) {
    System.out.println(matcher.group());  // cat, bat, rat, sat, mat
}
2.2. Caret (^) and Dollar ($) Meta Characters
The caret (^) matches at the start of the input string, and the dollar sign ($) matches at its end. They anchor a pattern so that it matches only at the beginning or only at the end of the string, respectively.
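When the input spans several lines, the anchors behave differently depending on whether the Pattern.MULTILINE flag is set. The following is a minimal sketch of that difference; the class name AnchorDemo, the helper countMatches, and the sample text are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {

    // Hypothetical helper: counts how many times regex matches in input with the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "Hello world\nHello again";   // two lines, made-up sample text

        // Default mode: ^ matches only at the very start of the input.
        System.out.println(countMatches("^Hello", 0, input));                  // 1
        // MULTILINE: ^ also matches right after each line terminator.
        System.out.println(countMatches("^Hello", Pattern.MULTILINE, input));  // 2
        // Default mode: "world" is not at the end of the input, so there is no match.
        System.out.println(countMatches("world$", 0, input));                  // 0
        // MULTILINE: $ also matches just before a line terminator.
        System.out.println(countMatches("world$", Pattern.MULTILINE, input));  // 1
    }
}
```

The single-line examples that follow use the default mode.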
// Matches "Hello" only if it is at the start of the string
Pattern pattern = Pattern.compile("^Hello");
Matcher matcher = pattern.matcher("Hello world");
System.out.println(matcher.find()); // true
// Matches "world" only if it is at the end of the string
pattern = Pattern.compile("world$");
matcher = pattern.matcher("Hello world");
System.out.println(matcher.find()); // true
2.3. Asterisk (*), Plus (+), and Question Mark (?) Meta Characters
- *: Matches zero or more repetitions of the preceding element.
- +: Matches one or more repetitions of the preceding element.
- ?: Matches zero or one repetition of the preceding element (makes it optional).
// Matches "", "a", "aa", "aaa", etc. (zero repetitions is an empty match)
Pattern pattern = Pattern.compile("a*");
Matcher matcher = pattern.matcher("aaab");
while (matcher.find()) {
    System.out.println(matcher.group());   // aaa, then two empty matches (before "b" and at the end)
}
// Matches "a", "aa", "aaa", etc., but at least one "a"
pattern = Pattern.compile("a+");
matcher = pattern.matcher("aaab");
while (matcher.find()) {
    System.out.println(matcher.group());   // aaa
}
// Matches "a" or "ab"
pattern = Pattern.compile("ab?");
matcher = pattern.matcher("ab");
while (matcher.find()) {
    System.out.println(matcher.group());   // ab
}
2.4. Braces ({}) Meta Characters
Braces specify how many times the preceding element must repeat: an exact count, a minimum, or a range.
- {n}: Exactly n repetitions.
- {n,}: At least n repetitions.
- {n,m}: Between n and m repetitions.
// Matches exactly 3 "a"s
Pattern pattern = Pattern.compile("a{3}");
Matcher matcher = pattern.matcher("aaab");
while (matcher.find()) {
    System.out.println(matcher.group());  // aaa
}
// Matches 2 or more "a"s
pattern = Pattern.compile("a{2,}");
matcher = pattern.matcher("aaaa");
while (matcher.find()) {
    System.out.println(matcher.group());   // aaaa
}
// Matches between 2 and 3 "a"s
pattern = Pattern.compile("a{2,3}");
matcher = pattern.matcher("aaa");
while (matcher.find()) {
    System.out.println(matcher.group());  // aaa
}
2.5. Square Brackets ([]) Meta Characters
Square brackets are used to define a character class, matching any single character within the brackets.
- [abc]: Matches any single character ‘a’, ‘b’, or ‘c’.
- [^abc]: Matches any single character except ‘a’, ‘b’, or ‘c’.
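Character classes can also contain ranges such as [0-9] or [a-zA-Z], as used in the table of meta characters. The sketch below illustrates ranges and a negated range; the class name CharClassRangeDemo, the helper findAll, and the sample text are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassRangeDemo {

    // Hypothetical helper: collects every non-overlapping match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "Room B12, floor 3";   // made-up sample text

        System.out.println(findAll("[A-Z]", input));           // [R, B]
        System.out.println(findAll("[0-9]+", input));          // [12, 3]
        System.out.println(findAll("[a-z]+", input));          // [oom, floor]
        // Negated class: anything that is not a letter, a digit, or a space.
        System.out.println(findAll("[^a-zA-Z0-9 ]", input));   // [,]
    }
}
```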
// Matches "a", "b", or "c"
Pattern pattern = Pattern.compile("[abc]");
Matcher matcher = pattern.matcher("a1b2c3");
while (matcher.find()) {
    System.out.println(matcher.group());  // Matches 'a', 'b', 'c'
}
// Matches any character except "a", "b", or "c"
pattern = Pattern.compile("[^abc]");
matcher = pattern.matcher("a1b2c3");
while (matcher.find()) {
    System.out.println(matcher.group());   // Matches '1', '2', '3'
}
3. Escaping Meta Characters with Backslash (\)
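A backslash escapes a meta-character so that it is treated as a literal character in the pattern. For example, the regex \. matches a literal dot (‘.’). In Java source code the backslash must itself be escaped inside a string literal, so the pattern is written as "\\.".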
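To see what escaping changes, it can help to compare an unescaped dot with an escaped one on the same input. The sketch below does that and also uses Pattern.quote, a standard java.util.regex method that escapes a whole string at once; the class name EscapeDemo and the helper countMatches are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {

    // Hypothetical helper: counts how many times regex matches in input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "1.2.3";

        // Unescaped dot: matches every character, digits included.
        System.out.println(countMatches(".", input));                 // 5
        // Escaped dot: matches only the two literal dots.
        System.out.println(countMatches("\\.", input));               // 2
        // Pattern.quote treats its whole argument as literal text.
        System.out.println(countMatches(Pattern.quote("."), input));  // 2

        // As a regex, a+b means one or more "a" followed by "b", so it does not match the text "a+b".
        System.out.println(Pattern.matches("a+b", "a+b"));                 // false
        System.out.println(Pattern.matches(Pattern.quote("a+b"), "a+b"));  // true
    }
}
```

Pattern.quote is mainly useful when the literal text is not known in advance, for example when it comes from user input.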
// Matches the literal dot character
Pattern pattern = Pattern.compile("\\.");
Matcher matcher = pattern.matcher("1.2.3");
while (matcher.find()) {
    System.out.println(matcher.group());  // prints "." twice
}
In this Java regex example, we learned to use meta characters in regular expressions to evaluate text strings.
Happy Learning !!