In Lucene, the WildcardQuery class runs wildcard-based searches against a Lucene index. Wildcard queries can be slow at runtime because they may need to iterate over many terms. To limit the performance hit, a wildcard term should not start with the asterisk wildcard (*).
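To see why a leading wildcard costs more, here is a minimal sketch in plain Java. It is a simplified model, not Lucene's implementation: a sorted `TreeSet` stands in for the sorted term dictionary, and the class and method names (`TermScanSketch`, `scanPrefix`, `scanSuffix`) are hypothetical. With a known prefix the scan can jump to the first candidate and stop early; with a leading wildcard there is nothing to seek to, so every term is examined.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Simplified model only: a TreeSet plays the role of a sorted term dictionary.
final class TermScanSketch {

    // Trailing wildcard (e.g. "prefer*"): seek to the prefix, stop at the first non-match.
    // Returns how many terms were examined; matching terms are added to 'matches'.
    static int scanPrefix(TreeSet<String> terms, String prefix, List<String> matches) {
        int examined = 0;
        for (String term : terms.tailSet(prefix, true)) {
            examined++;
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can share the prefix
            }
            matches.add(term);
        }
        return examined;
    }

    // Leading wildcard (e.g. "*ed"): no prefix to seek to, so every term is examined.
    static int scanSuffix(TreeSet<String> terms, String suffix, List<String> matches) {
        int examined = 0;
        for (String term : terms) {
            examined++;
            if (term.endsWith(suffix)) {
                matches.add(term);
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "agreeable", "explained", "prefer", "preferred",
                "prefers", "questions", "strangers"));

        List<String> prefixMatches = new ArrayList<>();
        int prefixExamined = scanPrefix(terms, "prefer", prefixMatches);
        System.out.println(prefixMatches + " after examining " + prefixExamined + " terms");

        List<String> suffixMatches = new ArrayList<>();
        int suffixExamined = scanSuffix(terms, "ed", suffixMatches);
        System.out.println(suffixMatches + " after examining " + suffixExamined + " terms");
    }
}
```

With these seven sample terms, the prefix scan examines four terms and the suffix scan examines all seven. On a real index with millions of terms, that gap is what makes leading wildcards expensive.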
1. Maven
Start by adding these Lucene dependencies to the pom.xml. We are using Lucene 9.10.0 and Java 21.
<properties>
  <maven.compiler.source>21</maven.compiler.source>
  <maven.compiler.target>21</maven.compiler.target>
  <lucene.version>9.10.0</lucene.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>${lucene.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analysis-common</artifactId>
    <version>${lucene.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-queryparser</artifactId>
    <version>${lucene.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-highlighter</artifactId>
    <version>${lucene.version}</version>
  </dependency>
</dependencies>
2. Supported Wildcards
Lucene supports the following wildcards in the search terms:
Wildcard | Usage |
---|---|
* | matches any character sequence (including the empty one) |
? | matches any single character |
'\' | escape character; the character that follows it is matched literally |
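The semantics in the table can be tried out without an index. The sketch below translates a wildcard pattern into a `java.util.regex` pattern; it only illustrates the matching rules and is not how Lucene evaluates the query internally. The class and method names (`WildcardSketch`, `toRegex`, `matches`) are hypothetical.

```java
import java.util.regex.Pattern;

// Illustration of the wildcard rules only; not Lucene's internal implementation.
final class WildcardSketch {

    // Translate a wildcard pattern into an equivalent regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append(".*");            // any character sequence, including empty
            } else if (c == '?') {
                regex.append('.');             // exactly one character
            } else if (c == '\\' && i + 1 < wildcard.length()) {
                // escape: take the next character literally
                regex.append(Pattern.quote(String.valueOf(wildcard.charAt(++i))));
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // True if the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("prefer*", "preferred"));   // true
        System.out.println(matches("prefer*", "prefer"));      // true: * may match nothing
        System.out.println(matches("prefer??d", "preferred")); // true: ?? matches "re"
        System.out.println(matches("prefer?d", "preferred"));  // false: one ? is too few
        System.out.println(matches("a\\*b", "a*b"));           // true: escaped * is literal
        System.out.println(matches("a\\*b", "axb"));           // false
    }
}
```

Note that the match is against a whole term, so `prefer?d` does not match `preferred`: a single `?` cannot stand for the two characters `re`.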
3. Lucene WildcardQuery Example
In this example, I am reusing the indexes created in the Lucene example. If you want to learn more about creating Lucene indexes with text files, follow the linked article.
import java.nio.file.Paths;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.search.uhighlight.UnifiedHighlighter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
public class WildcardQueryExample {

    // Directory that contains the Lucene index
    private static final String INDEX_DIR = "c:/temp/lucene/indexedFiles";

    // WildcardQuery does not analyze its term, so each pattern must match
    // the indexed terms exactly, including case
    private static final String searchTerm_1 = "prefer*";
    private static final String searchTerm_2 = "prefer??d";

    public static void main(String[] args) throws Exception {
        // Open the index directory and a point-in-time reader on it;
        // try-with-resources closes both when the search is done
        try (Directory dir = FSDirectory.open(Paths.get(INDEX_DIR));
             IndexReader reader = DirectoryReader.open(dir)) {

            // Create the Lucene searcher. It searches over a single IndexReader.
            IndexSearcher searcher = new IndexSearcher(reader);

            // Analyzer the highlighter uses to re-tokenize the stored field text
            Analyzer analyzer = new StandardAnalyzer();
            UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, analyzer);

            /* Wildcard "*" example */
            Query query = new WildcardQuery(new Term("contents", searchTerm_1));

            // Search the Lucene documents, returning at most 10 hits in index order
            TopDocs hits = searcher.search(query, 10, Sort.INDEXORDER);
            System.out.println("Search terms found in :: " + hits.totalHits + " files");

            // Print the best passage of each hit with the matched terms highlighted
            String[] fragments = highlighter.highlight("contents", query, hits);
            for (String f : fragments) {
                System.out.println(f);
            }

            /* Wildcard "?" example */
            query = new WildcardQuery(new Term("contents", searchTerm_2));

            hits = searcher.search(query, 10, Sort.INDEXORDER);
            System.out.println("Search terms found in :: " + hits.totalHits + " files");

            fragments = highlighter.highlight("contents", query, hits);
            for (String f : fragments) {
                System.out.println(f);
            }
        }
    }
}
The program output:
Search terms found in :: 1 hits files
Questions explained agreeable <b>preferred</b> strangers too him her son.
Search terms found in :: 1 hits files
Questions explained agreeable <b>preferred</b> strangers too him her son.
Drop me your questions related to Lucene WildcardQuery class usage in the comments section.
Happy Learning !!