In this Java regex tutorial, we will learn to use regular expressions to test whether a user has entered a correctly formatted International Standard Book Number (ISBN).
Valid International Standard Book Number (ISBN)
The International Standard Book Number (ISBN) is a 13-digit number (or 10 digits in the older ISBN-10 format) that uniquely identifies books and book-like products published internationally. The purpose of the ISBN is to identify one title or edition of a title from one specific publisher. Because it is unique to that edition, it allows more efficient marketing of products by booksellers, libraries, universities, wholesalers and distributors.
Every ISBN consists of thirteen digits (or ten in the ISBN-10 format; the last character of an ISBN-10 may be X), and whenever it is printed it is preceded by the letters ISBN. An ISBN-10 is divided into four parts, each separated by a hyphen; the first three parts vary in length. An ISBN-13 has the same structure with a 978 or 979 prefix in front.
The four parts of an ISBN are as follows:
- Group or country identifier which identifies a national or geographic grouping of publishers;
- Publisher identifier which identifies a particular publisher within a group;
- Title identifier which identifies a particular title or edition of a title;
- Check digit which is the single character at the end of the ISBN (a digit, or X in an ISBN-10) and validates the ISBN.
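As a small illustration of the four parts listed above, the following sketch splits a separated ISBN-10 on its hyphens or spaces. The class and method names (`IsbnParts`, `split`) are made up for this example, and it assumes the input is already well-formed. Because the first three parts vary in length, an ISBN written without separators cannot be split this way.

```java
import java.util.Arrays;

class IsbnParts {
    // Splits a hyphen- or space-separated ISBN-10 into its four parts.
    // Assumption: the input is well-formed, e.g. "0-596-52068-9".
    static String[] split(String isbn10) {
        return isbn10.split("[- ]");
    }

    public static void main(String[] args) {
        String[] parts = split("0-596-52068-9");
        System.out.println("Group identifier:     " + parts[0]);
        System.out.println("Publisher identifier: " + parts[1]);
        System.out.println("Title identifier:     " + parts[2]);
        System.out.println("Check digit:          " + parts[3]);
        System.out.println(Arrays.toString(parts));
    }
}
```

For "0-596-52068-9" the parts are 0, 596, 52068 and 9.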
All of the following can be considered as examples of valid ISBNs:
ISBN 978-0-596-52068-7
ISBN-13: 978-0-596-52068-7
978 0 596 52068 7
9780596520687
ISBN-10 0-596-52068-9
0-596-52068-9
Regex to Validate ISBNs
To validate the format of ISBNs, we can use the following regexes:
Regex for ISBN-10: ^(?:ISBN(?:-10)?:? )?(?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$)[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]$
Regex for ISBN-13: ^(?:ISBN(?:-13)?:? )?(?=[0-9]{13}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)97[89][- ]?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9]$
Regex for ISBN-10 or ISBN-13: ^(?:ISBN(?:-1[03])?:? )?(?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$|97[89][0-9]{10}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]$
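These patterns are dense, so here is the ISBN-10 regex again in free-spacing form with one comment per piece. This is only a reading aid; the class name `Isbn10RegexExplained` is made up for this example. One assumption to note: with `Pattern.COMMENTS`, Java ignores whitespace in the pattern, even inside character classes, so each literal space is written as `\x20` here.

```java
import java.util.regex.Pattern;

class Isbn10RegexExplained {
    // The ISBN-10 pattern in free-spacing form (Pattern.COMMENTS).
    // Whitespace in the pattern is ignored in this mode, so a literal
    // space is written as \x20.
    static final Pattern ISBN10 = Pattern.compile(
        "^\n" +
        "(?:ISBN(?:-10)?:?\\x20)?      # optional label: 'ISBN ', 'ISBN-10 ', 'ISBN-10: '\n" +
        "(?=                           # lookahead: check the overall shape first\n" +
        "    [0-9X]{10}$               #   either 10 characters with no separators\n" +
        "  |                           #   or\n" +
        "    (?=(?:[0-9]+[-\\x20]){3}) #   three groups, each followed by a separator,\n" +
        "    [-0-9X\\x20]{13}$         #   in exactly 13 characters\n" +
        ")\n" +
        "[0-9]{1,5}[-\\x20]?           # group identifier, 1 to 5 digits\n" +
        "[0-9]+[-\\x20]?               # publisher identifier\n" +
        "[0-9]+[-\\x20]?               # title identifier\n" +
        "[0-9X]$                       # check digit: 0-9 or X\n",
        Pattern.COMMENTS);

    public static void main(String[] args) {
        System.out.println(ISBN10.matcher("0-596-52068-9").matches());          // true
        System.out.println(ISBN10.matcher("ISBN-10: 0-596-52068-9").matches()); // true
        System.out.println(ISBN10.matcher("0-5961-52068-9").matches());         // false
    }
}
```

The lookahead does the length check up front, which is why the main part of the pattern can use open-ended `[0-9]+` groups for the publisher and title identifiers.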
Note: You cannot validate an ISBN using a regex alone, because the last digit is computed using a checksum algorithm. The regular expressions in this section validate the format of an ISBN only.
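To cover the check digit as well, a second step is needed after the format check. The following is a minimal sketch of that step, assuming the standard rules: for ISBN-10, the digits are weighted 10 down to 1 (with X counting as 10) and the sum must be divisible by 11; for ISBN-13, the digits are weighted alternately 1 and 3 and the sum must be divisible by 10. The class and method names (`IsbnChecksum`, `isValidIsbn10`, `isValidIsbn13`, `strip`) are made up for this example.

```java
class IsbnChecksum {
    // Removes an optional "ISBN", "ISBN-10" or "ISBN-13" label and all
    // hyphens and spaces, leaving only the ISBN characters.
    static String strip(String isbn) {
        return isbn.replaceFirst("^ISBN(?:-1[03])?:? ", "").replaceAll("[- ]", "");
    }

    // ISBN-10: weights 10, 9, ..., 1; 'X' as the last character counts as 10.
    static boolean isValidIsbn10(String isbn) {
        String s = strip(isbn);
        if (!s.matches("[0-9]{9}[0-9X]")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            char c = s.charAt(i);
            int value = (c == 'X') ? 10 : c - '0';
            sum += value * (10 - i);
        }
        return sum % 11 == 0;
    }

    // ISBN-13: weights alternate 1, 3, 1, 3, ... from the left.
    static boolean isValidIsbn13(String isbn) {
        String s = strip(isbn);
        if (!s.matches("[0-9]{13}")) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 13; i++) {
            int digit = s.charAt(i) - '0';
            sum += (i % 2 == 0) ? digit : digit * 3;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidIsbn13("ISBN-13: 978-0-596-52068-7")); // true
        System.out.println(isValidIsbn10("0-596-52068-9"));              // true
        System.out.println(isValidIsbn10("0 512 52068 9"));              // false
    }
}
```

The last call shows why the format check alone is not enough: "0 512 52068 9" has a valid ISBN-10 layout, but its weighted sum is 150, which is not divisible by 11.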
Now let's test our ISBN regexes using some sample ISBNs.
Validate ISBN-10 Formats Only
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Isbn10FormatDemo {
    public static void main(String[] args) {
        List<String> isbns = new ArrayList<>();

        // Valid ISBN-10 formats
        isbns.add("0-596-52068-9");
        isbns.add("0 512 52068 9");
        isbns.add("ISBN-10 0-596-52068-9");
        isbns.add("ISBN-10: 0-596-52068-9");

        // Invalid ISBN-10 formats
        isbns.add("0-5961-52068-9");
        isbns.add("11 5122 52068 9");
        isbns.add("ISBN-13 0-596-52068-9");
        isbns.add("ISBN-10- 0-596-52068-9");

        // Checks the format only, not the check digit
        String regex = "^(?:ISBN(?:-10)?:? )?(?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$)[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]$";
        Pattern pattern = Pattern.compile(regex);

        for (String isbn : isbns) {
            Matcher matcher = pattern.matcher(isbn);
            System.out.println(matcher.matches());
        }
    }
}

Output:
true
true
true
true
false
false
false
false
Validate ISBN-13 Formats Only
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Isbn13FormatDemo {
    public static void main(String[] args) {
        List<String> isbns = new ArrayList<>();

        // Valid ISBN-13 formats
        isbns.add("ISBN 978-0-596-52068-7");
        isbns.add("ISBN-13: 978-0-596-52068-7");
        isbns.add("978 0 596 52068 7");
        isbns.add("9780596520687");

        // Invalid ISBN-13 formats
        isbns.add("ISBN 11978-0-596-52068-7");
        isbns.add("ISBN-12: 978-0-596-52068-7");
        isbns.add("978 10 596 52068 7");
        isbns.add("119780596520687");

        // Checks the format only, not the check digit
        String regex = "^(?:ISBN(?:-13)?:? )?(?=[0-9]{13}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)97[89][- ]?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9]$";
        Pattern pattern = Pattern.compile(regex);

        for (String isbn : isbns) {
            Matcher matcher = pattern.matcher(isbn);
            System.out.println(matcher.matches());
        }
    }
}

Output:
true
true
true
true
false
false
false
false
Validate Both ISBN-10 and ISBN-13 Formats
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IsbnFormatDemo {
    public static void main(String[] args) {
        List<String> isbns = new ArrayList<>();

        // Valid ISBN-13 and ISBN-10 formats
        isbns.add("ISBN 978-0-596-52068-7");
        isbns.add("ISBN-13: 978-0-596-52068-7");
        isbns.add("978 0 596 52068 7");
        isbns.add("9780596520687");
        isbns.add("0-596-52068-9");
        isbns.add("0 512 52068 9");
        isbns.add("ISBN-10 0-596-52068-9");
        isbns.add("ISBN-10: 0-596-52068-9");

        // Invalid formats
        isbns.add("ISBN 11978-0-596-52068-7");
        isbns.add("ISBN-12: 978-0-596-52068-7");
        isbns.add("978 10 596 52068 7");
        isbns.add("119780596520687");
        isbns.add("0-5961-52068-9");
        isbns.add("11 5122 52068 9");
        isbns.add("ISBN-11 0-596-52068-9");
        isbns.add("ISBN-10- 0-596-52068-9");

        // Checks the format only, not the check digit
        String regex = "^(?:ISBN(?:-1[03])?:? )?(?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$|97[89][0-9]{10}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]$";
        Pattern pattern = Pattern.compile(regex);

        for (String isbn : isbns) {
            Matcher matcher = pattern.matcher(isbn);
            System.out.println(matcher.matches());
        }
    }
}

Output:
true
true
true
true
true
true
true
true
false
false
false
false
false
false
false
false
I encourage you to experiment with the regular expressions above, try more ISBN variations, and let me know your findings.
Happy Learning !!