Apache Lucene’s ByteBuffersDirectory is an in-memory directory implementation added in Lucene 8.4.0. Internally, it keeps index data in Java NIO ByteBuffer instances for efficient reads and writes in memory.
- The ByteBuffersDirectory is useful for demos and tests that need fast, transient indexing and searching without persistent storage.
- If you are looking for fast memory-based indexes for a production application, consider using MMapDirectory, as it uses the OS caches more effectively (through memory-mapped buffers).
The previously used RAMDirectory was deprecated and has been removed as of Lucene 9, so it should not be used in new code.
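To make the ByteBuffer mention above more concrete, here is a minimal sketch of writing to and reading from a heap-allocated ByteBuffer using only the JDK. It is not Lucene's internal code; the class name HeapByteBufferSketch and the length-prefixed layout are assumptions made purely for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Lucene.
public class HeapByteBufferSketch {

    // Writes a length-prefixed UTF-8 string into a heap-allocated buffer.
    static ByteBuffer write(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + bytes.length);
        buffer.putInt(bytes.length);
        buffer.put(bytes);
        buffer.flip(); // switch the buffer from writing to reading
        return buffer;
    }

    // Reads the length prefix, then the string bytes that follow it.
    static String read(ByteBuffer buffer) {
        int length = buffer.getInt();
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = write("hello happy world");
        System.out.println("Direct buffer? " + buffer.isDirect());
        System.out.println("Content: " + read(buffer));
    }
}
```

Everything here lives on the Java heap and disappears when the JVM exits, which is the same trade-off an in-memory index makes.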
1. Maven
Start by adding these Lucene dependencies. We are using Lucene 9.10.0 and Java 21.
<properties>
    <maven.compiler.source>21</maven.compiler.source>
    <maven.compiler.target>21</maven.compiler.target>
    <lucene.version>9.10.0</lucene.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>${lucene.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-analysis-common</artifactId>
        <version>${lucene.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-queryparser</artifactId>
        <version>${lucene.version}</version>
    </dependency>
</dependencies>
2. Lucene ByteBuffersDirectory Example
The ByteBuffersDirectory class is an in-memory directory implementation. It stores the index files on the heap for quick access.
The following example indexes four short documents using the indexDoc() method. It then searches the indexed documents for the term “happy” in the searchIndex() method.
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
public class ByteBuffersDirectoryExample {

    public static void main(String[] args) throws IOException {
        // Create the in-memory directory
        ByteBuffersDirectory byteBufferDir = new ByteBuffersDirectory();
        // Build the analyzer used for both indexing and query parsing
        Analyzer analyzer = new StandardAnalyzer();
        // Write some docs to the ByteBuffersDirectory
        writeIndex(byteBufferDir, analyzer);
        // Search the indexed docs in the ByteBuffersDirectory
        searchIndex(byteBufferDir, analyzer);
    }

    static void writeIndex(ByteBuffersDirectory byteBufferDir, Analyzer analyzer) {
        // IndexWriter configuration: always start with a fresh index
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        // try-with-resources closes the writer even if indexing fails
        try (IndexWriter writer = new IndexWriter(byteBufferDir, iwc)) {
            // Create some docs with name and content
            indexDoc(writer, "document-1", "hello world");
            indexDoc(writer, "document-2", "hello happy world");
            indexDoc(writer, "document-3", "hello happy world");
            indexDoc(writer, "document-4", "hello hello world");
        } catch (IOException e) {
            // Any indexing error ends up here
            e.printStackTrace();
        }
    }

    static void indexDoc(IndexWriter writer, String name, String content) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("name", name, Store.YES));
        doc.add(new TextField("content", content, Store.YES));
        writer.addDocument(doc);
    }

    static void searchIndex(ByteBuffersDirectory byteBufferDir, Analyzer analyzer) {
        String searchTerm = "happy";
        // try-with-resources closes the reader even if the search fails
        try (IndexReader reader = DirectoryReader.open(byteBufferDir)) {
            // Create the index searcher
            IndexSearcher searcher = new IndexSearcher(reader);
            // Build the query against the "content" field
            QueryParser qp = new QueryParser("content", analyzer);
            Query query = qp.parse(searchTerm);
            // Search the index for the top 10 matches
            TopDocs foundDocs = searcher.search(query, 10);
            // Total found documents
            System.out.println("Total Results :: " + foundDocs.totalHits);
            // Print the name, content and score of each matching doc
            for (ScoreDoc sd : foundDocs.scoreDocs) {
                Document d = searcher.doc(sd.doc);
                System.out.println("Document Name : " + d.get("name")
                        + " :: Content : " + d.get("content")
                        + " :: Score : " + sd.score);
            }
        } catch (IOException | ParseException e) {
            // Any search or query parsing error ends up here
            e.printStackTrace();
        }
    }
}
The program output:
Total Results :: 2 hits
Document Name : document-2 :: Content : hello happy world :: Score : 0.30376968
Document Name : document-3 :: Content : hello happy world :: Score : 0.30376968
3. Lucene MMapDirectory Example
Another directory implementation in Apache Lucene that offers near-memory speed is MMapDirectory. The MMapDirectory is a file-based directory implementation that reads index files through memory mapping. It takes advantage of the operating system’s virtual memory management to map files directly into memory, thus providing fast access to index data.
The MMapDirectory is the best fit for use cases where we want to persist the index on the filesystem and still read it at close to in-memory speed.
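To illustrate what mapping a file into memory means, here is a minimal sketch that uses the JDK's FileChannel.map() directly. It is not how MMapDirectory is implemented internally; the class name MemoryMappedReadSketch, its helper methods and the temporary file are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class, not part of Lucene.
public class MemoryMappedReadSketch {

    // Maps the whole file read-only and decodes its bytes as UTF-8.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[mapped.remaining()];
            mapped.get(bytes); // served from the OS page cache once loaded
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    // Writes the text to a temporary file, then reads it back through a mapping.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".txt");
            file.toFile().deleteOnExit();
            Files.writeString(file, text);
            return readMapped(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello happy world"));
    }
}
```

The data stays on disk, yet reads go through a buffer backed by the operating system's page cache rather than through explicit read calls. That is the property that lets a file-based index behave much like an in-memory one.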
In the code, there is hardly any difference between using a ByteBuffersDirectory and an MMapDirectory, because both extend Lucene’s Directory class. The only difference is how we create the instance: MMapDirectory takes the path of the folder that holds the index files.
MMapDirectory directory = new MMapDirectory(Path.of("/path/to/index"));
The following example uses the MMapDirectory for indexing and searching in the index files.
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.MMapDirectory;
import java.io.IOException;
import java.nio.file.Path;
public class MMapDirectoryExample {

    public static void main(String[] args) throws IOException {
        // Index files are stored under this folder and memory-mapped for reading
        MMapDirectory mmapDir = new MMapDirectory(Path.of("c:/temp", "lucene", "index"));
        Analyzer analyzer = new StandardAnalyzer();
        writeIndex(mmapDir, analyzer);
        searchIndex(mmapDir, analyzer);
    }

    static void writeIndex(MMapDirectory mmapDir, Analyzer analyzer) {
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        // try-with-resources closes the writer even if indexing fails
        try (IndexWriter writer = new IndexWriter(mmapDir, iwc)) {
            // Create some docs with name and content
            indexDoc(writer, "document-1", "hello world");
            indexDoc(writer, "document-2", "hello happy world");
            indexDoc(writer, "document-3", "hello happy world");
            indexDoc(writer, "document-4", "hello hello world");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    static void indexDoc(IndexWriter writer, String name, String content) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("name", name, Store.YES));
        doc.add(new TextField("content", content, Store.YES));
        writer.addDocument(doc);
    }

    static void searchIndex(MMapDirectory mmapDir, Analyzer analyzer) {
        String searchTerm = "happy";
        // try-with-resources closes the reader even if the search fails
        try (IndexReader reader = DirectoryReader.open(mmapDir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            QueryParser qp = new QueryParser("content", analyzer);
            Query query = qp.parse(searchTerm);
            TopDocs foundDocs = searcher.search(query, 10);
            System.out.println("Total Results :: " + foundDocs.totalHits);
            for (ScoreDoc sd : foundDocs.scoreDocs) {
                Document d = searcher.doc(sd.doc);
                System.out.println("Document Name : " + d.get("name")
                        + " :: Content : " + d.get("content")
                        + " :: Score : " + sd.score);
            }
        } catch (IOException | ParseException e) {
            e.printStackTrace();
        }
    }
}
The program output:
Total Results :: 2 hits
Document Name : document-2 :: Content : hello happy world :: Score : 0.30376968
Document Name : document-3 :: Content : hello happy world :: Score : 0.30376968
4. Conclusion
As discussed in this Lucene tutorial, ByteBuffersDirectory and MMapDirectory serve different purposes. The ByteBuffersDirectory is a great fit for fast, in-memory indexing and searching in demo applications that need only transient indexes and test data. The MMapDirectory is suitable for production use cases where the index must be persisted on disk but we still want fast, memory-like access to it.
Happy Learning !!