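In this tutorial, we will learn to match any character that belongs to the Greek script or to the “Greek Extended” Unicode block. In Java, \p{InGreek} selects the Unicode block named Greek (Greek and Coptic, U+0370 to U+03FF), which holds the basic Greek alphabet, and \p{InGreekExtended} selects the Greek Extended block (U+1F00 to U+1FFF).
Solution regexes: \p{InGreek} and \p{InGreekExtended} (written as "\\p{InGreek}" and "\\p{InGreekExtended}" inside Java string literals)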
Match any character in Greek script
Let’s look at an example program that finds every Greek character in a string.
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String content = "A math equation might be α + β = λ + γ";
    // The backslash is doubled because the regex lives in a Java string literal.
    String regex = "\\p{InGreek}";

    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(content);

    // Print the position and text of every match.
    while (matcher.find()) {
        System.out.print("Start index: " + matcher.start());
        System.out.print(" End index: " + matcher.end() + " ");
        System.out.println(" : " + matcher.group());
    }

Output:

    Start index: 25 End index: 26  : α
    Start index: 29 End index: 30  : β
    Start index: 33 End index: 34  : λ
    Start index: 37 End index: 38  : γ
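A caveat worth illustrating: \p{InGreek} selects by Unicode block, while Java 7 and later also accept \p{IsGreek}, which selects by script. The two sets overlap but are not identical. The sketch below compares them on one Greek-block letter, one Greek Extended letter, and one Coptic letter that sits inside the Greek block. The class name GreekBlockVsScript and the helper findAll are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreekBlockVsScript {

    // Hypothetical helper: collects every match of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // U+03B1: small alpha (Greek block, Greek script)
        // U+1FB2: small alpha with varia and ypogegrammeni (Greek Extended block, Greek script)
        // U+03E2: Coptic capital shei (Greek block, Coptic script)
        String text = "\u03B1 \u1FB2 \u03E2";

        System.out.println("Block  InGreek: " + findAll("\\p{InGreek}", text));
        System.out.println("Script IsGreek: " + findAll("\\p{IsGreek}", text));
    }
}
```

With this input, the block-based pattern should pick up U+03B1 and the Coptic letter U+03E2, while the script-based pattern should pick up U+03B1 and U+1FB2. Which one to use depends on whether Coptic letters and the Greek Extended range should count as matches.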
Match any character in “Greek Extended” Unicode block
Let’s look at an example program that matches any character from the “Greek Extended” block (U+1F00 to U+1FFF) in a string.
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String content = "Let's learn some new greek extended characters : ᾲ , ᾨ etc.";
    // The backslash is doubled because the regex lives in a Java string literal.
    String regex = "\\p{InGreekExtended}";

    Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(content);

    // Print the position and text of every match.
    while (matcher.find()) {
        System.out.print("Start index: " + matcher.start());
        System.out.print(" End index: " + matcher.end() + " ");
        System.out.println(" : " + matcher.group());
    }

Output:

    Start index: 49 End index: 50  : ᾲ
    Start index: 53 End index: 54  : ᾨ
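As a cross-check, the same membership test can be written without a regex. The sketch below assumes Java 8 or later (for String.codePoints) and relies on Character.UnicodeBlock.of to decide whether a code point lies in the Greek Extended block. The class name GreekExtendedFilter and the helper greekExtendedOnly are hypothetical names chosen for this example.

```java
public class GreekExtendedFilter {

    // Hypothetical helper: returns only the code points of text that lie in
    // the Greek Extended block (U+1F00 to U+1FFF), in their original order.
    static String greekExtendedOnly(String text) {
        StringBuilder kept = new StringBuilder();
        text.codePoints()
            .filter(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.GREEK_EXTENDED)
            .forEach(kept::appendCodePoint);
        return kept.toString();
    }

    public static void main(String[] args) {
        // U+1FB2 and U+1FA8 are the two Greek Extended characters used above.
        String content = "Let's learn some new greek extended characters : \u1FB2 , \u1FA8 etc.";

        // Block lookup through the Character API.
        System.out.println(greekExtendedOnly(content));

        // Regex route: \P (capital P) negates the property, so this deletes
        // every character that is outside the Greek Extended block.
        System.out.println(content.replaceAll("\\P{InGreekExtended}", ""));
    }
}
```

Both statements should print the same two characters. The Character API route can be handy when the text is already being processed code point by code point and compiling a pattern would add little.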
References:
http://en.wikipedia.org/wiki/Greek_alphabet
http://www.alanwood.net/unicode/greek_extended.html
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html