edu.mayo.informatics.indexer.lucene
Class Index

java.lang.Object
  extended by edu.mayo.informatics.indexer.lucene.Index

public class Index
extends java.lang.Object

This is an abstracted view of a Lucene index.

Author:
Dan Armbrust

Field Summary
static java.lang.String UNIQUE_DOCUMENT_IDENTIFIER_FIELD
           
 
Constructor Summary
Index(java.io.File location, org.apache.lucene.analysis.Analyzer analyzer)
          Opens an index using a provided analyzer.
Index(java.io.File location, java.lang.String[] stopWords)
          This constructor will open an index using a StandardAnalyzer.
 
Method Summary
 void addDocument(org.apache.lucene.document.Document document)
          Adds a document to the currently open indexWriter.
 void addDocument(org.apache.lucene.document.Document document, org.apache.lucene.analysis.Analyzer analyzer)
          Adds a document to the currently open indexWriter.
 void closeIndexReader()
           
 void closeIndexWriter()
          Closes the currently opened indexWriter.
 LuceneIndexReader getIndexReader()
           
 LuceneIndexReader getIndexReader(boolean useInMemoryIndex)
           
 java.io.File getLocation()
           
 void openBatchFSIndexWriter(boolean clearContents)
          Open the index for adding new documents.
 void openBatchRAMIndexWriter(boolean clearContents)
          Open the index for adding new documents.
 void openFSIndexWriter(boolean clearContents)
          Open the index for adding new documents.
 void openIndexReader()
           
 void openIndexReader(boolean useInMemoryIndex)
           
 void optimizeIndex()
           
 int removeDocument(java.lang.String uniqueDocumentIdentifier)
           
 int removeDocument(java.lang.String field, java.lang.String fieldValue)
           
 void setAnalyzer(org.apache.lucene.analysis.Analyzer analyzer)
          Change the analyzer of an index.
 void setDocsPerTempIndex(int i)
          How many documents to write out per temporary index.
 void setMaxBufferedDocs(int i)
          See the Lucene documentation.
 void setMaxFieldLength(int i)
          Lucene will truncate fields longer than this.
 void setMaxMergeDocs(int i)
          See the Lucene documentation.
 void setMergeFactor(int i)
          How many documents to add in memory before writing to the index.
 void setUseCompoundFile(boolean bool)
          Whether or not to use the new compound file format.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UNIQUE_DOCUMENT_IDENTIFIER_FIELD

public static final java.lang.String UNIQUE_DOCUMENT_IDENTIFIER_FIELD
See Also:
Constant Field Values
Constructor Detail

Index

public Index(java.io.File location,
             org.apache.lucene.analysis.Analyzer analyzer)
Opens an index using a provided analyzer.

Parameters:
location - Location on disk of the index to write to.
analyzer - The analyzer to use while indexing.

Index

public Index(java.io.File location,
             java.lang.String[] stopWords)
This constructor will open an index using a StandardAnalyzer.

Parameters:
location - Location on disk at which to create the index.
stopWords - Optional list of stopwords (words not to index) to use in the StandardAnalyzer.
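
To illustrate the stopWords parameter, here is a minimal sketch of what "words not to index" means in practice. It is not the StandardAnalyzer implementation: the class StopWordSketch, its filter method, and the whitespace tokenization are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: not part of the indexer API or of Lucene.
final class StopWordSketch {

    // Lower-cases the text, splits it on whitespace, and drops every token
    // that appears in stopWords. A real analyzer tokenizes far more carefully.
    static List<String> filter(String text, String[] stopWords) {
        Set<String> stops = new HashSet<>();
        if (stopWords != null) { // the list is documented as optional
            for (String word : stopWords) {
                stops.add(word.toLowerCase(Locale.ROOT));
            }
        }
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !stops.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] stopWords = {"the", "of", "a"};
        // prints [history, clinic]
        System.out.println(filter("The history of a clinic", stopWords));
    }
}
```

Terms removed this way cannot be searched for later, so a stop word list is normally kept to very common words.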
Method Detail

openBatchRAMIndexWriter

public void openBatchRAMIndexWriter(boolean clearContents)
                             throws IndexWriterAlreadyOpenException,
                                    InternalIndexerErrorException
Open the index for adding new documents. Indexes to RAM first for performance reasons.

Parameters:
clearContents - True to erase current contents, false to append to them
Throws:
IndexWriterAlreadyOpenException - Thrown if an indexWriter is already open.
InternalIndexerErrorException - Thrown if an unexpected error occurs.

openBatchFSIndexWriter

public void openBatchFSIndexWriter(boolean clearContents)
                            throws InternalIndexerErrorException,
                                   IndexWriterAlreadyOpenException
Open the index for adding new documents. Indexes to the File System.

Parameters:
clearContents - True to erase current contents, false to append to them
Throws:
IndexWriterAlreadyOpenException - Thrown if an indexWriter is already open.
InternalIndexerErrorException - Thrown if an unexpected error occurs.

openFSIndexWriter

public void openFSIndexWriter(boolean clearContents)
                       throws InternalIndexerErrorException,
                              IndexWriterAlreadyOpenException
Open the index for adding new documents. Use for small updates. Use a Batch writer for large updates.

Parameters:
clearContents - True to erase current contents, false to append to them
Throws:
IndexWriterAlreadyOpenException - Thrown if an indexWriter is already open.
InternalIndexerErrorException - Thrown if an unexpected error occurs.

closeIndexWriter

public void closeIndexWriter()
                      throws InternalIndexerErrorException
Closes the currently opened indexWriter.

Throws:
InternalIndexerErrorException
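
The writer methods above imply a simple lifecycle: open one writer, add documents, then close it before opening another. The following is a minimal sketch of that rule using a stand-in class. WriterStateSketch and all of its members are hypothetical, and IllegalStateException stands in for the documented IndexWriterAlreadyOpenException and IndexWriterNotOpenException. The real class may track its state differently.

```java
// Hypothetical stand-in: models only the "one open writer at a time" rule.
final class WriterStateSketch {
    private boolean writerOpen = false;
    private int documentCount = 0;

    // Mirrors the documented clearContents flag: true erases, false appends.
    void openWriter(boolean clearContents) {
        if (writerOpen) {
            // The real class is documented to throw IndexWriterAlreadyOpenException.
            throw new IllegalStateException("an indexWriter is already open");
        }
        if (clearContents) {
            documentCount = 0;
        }
        writerOpen = true;
    }

    // Counts a document; only allowed while a writer is open.
    void addDocument() {
        if (!writerOpen) {
            // The real class lists IndexWriterNotOpenException for this case.
            throw new IllegalStateException("no indexWriter is open");
        }
        documentCount++;
    }

    void closeWriter() {
        writerOpen = false;
    }

    int documentCount() {
        return documentCount;
    }

    public static void main(String[] args) {
        WriterStateSketch index = new WriterStateSketch();
        index.openWriter(true);   // start from an empty index
        index.addDocument();
        index.closeWriter();
        index.openWriter(false);  // append to the existing contents
        index.addDocument();
        index.closeWriter();
        System.out.println(index.documentCount());
    }
}
```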

openIndexReader

public void openIndexReader()
                     throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

openIndexReader

public void openIndexReader(boolean useInMemoryIndex)
                     throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

getIndexReader

public LuceneIndexReader getIndexReader()
                                 throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

getIndexReader

public LuceneIndexReader getIndexReader(boolean useInMemoryIndex)
                                 throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

closeIndexReader

public void closeIndexReader()
                      throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

addDocument

public void addDocument(org.apache.lucene.document.Document document)
                 throws InternalIndexerErrorException
Adds a document to the currently open indexWriter.

Parameters:
document - The document to add to the index.
Throws:
InternalIndexerErrorException
IndexWriterNotOpenException

addDocument

public void addDocument(org.apache.lucene.document.Document document,
                        org.apache.lucene.analysis.Analyzer analyzer)
                 throws InternalIndexerErrorException
Adds a document to the currently open indexWriter.

Parameters:
document - The document to add to the index.
analyzer - The analyzer to use
Throws:
InternalIndexerErrorException
IndexWriterNotOpenException

removeDocument

public int removeDocument(java.lang.String uniqueDocumentIdentifier)
                   throws InternalIndexerErrorException,
                          OperatorErrorException
Throws:
InternalIndexerErrorException
OperatorErrorException

removeDocument

public int removeDocument(java.lang.String field,
                          java.lang.String fieldValue)
                   throws OperatorErrorException,
                          InternalIndexerErrorException
Throws:
OperatorErrorException
InternalIndexerErrorException

optimizeIndex

public void optimizeIndex()
                   throws InternalIndexerErrorException
Throws:
InternalIndexerErrorException

setDocsPerTempIndex

public void setDocsPerTempIndex(int i)
                         throws OperatorErrorException
How many documents to write out per temporary index. Used to tune performance and to control the number of open file handles. Only used on writers.

Parameters:
i - How many docs to add to the index before opening a new temporary index.
Throws:
OperatorErrorException
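
As a rough illustration of this setting, the sketch below splits a list of documents into batches of at most docsPerTempIndex, one batch per temporary index. This shows only the counting behavior described above. The class TempIndexBatchSketch and its partition method are hypothetical, and how the real batch writers create and merge their temporary indexes is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of the indexer API.
final class TempIndexBatchSketch {

    // Groups docs into consecutive batches holding at most docsPerTempIndex
    // entries each; the last batch may be smaller.
    static <T> List<List<T>> partition(List<T> docs, int docsPerTempIndex) {
        if (docsPerTempIndex <= 0) {
            throw new IllegalArgumentException("docsPerTempIndex must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += docsPerTempIndex) {
            int end = Math.min(start + docsPerTempIndex, docs.size());
            batches.add(new ArrayList<>(docs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("d1", "d2", "d3", "d4", "d5");
        // Five documents at two per temporary index give three batches.
        System.out.println(partition(docs, 2));
    }
}
```

A larger value means fewer temporary indexes, and so fewer files to hold open, at the cost of more work per temporary index.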

setMaxFieldLength

public void setMaxFieldLength(int i)
                       throws OperatorErrorException
Lucene will truncate fields longer than this. Only used on writers.

Parameters:
i - The max length that a field can be.
Throws:
OperatorErrorException
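
The sketch below illustrates the truncation described above. It assumes the value is passed through to Lucene's IndexWriter, where the limit counts terms in a field rather than characters; that pass-through is an assumption, not something this page states. The class FieldTruncationSketch and its indexedTerms method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of the indexer API or of Lucene.
final class FieldTruncationSketch {

    // Returns the leading terms that fit within maxFieldLength; every term
    // past the limit is dropped and therefore never becomes searchable.
    static List<String> indexedTerms(List<String> terms, int maxFieldLength) {
        int kept = Math.min(terms.size(), Math.max(0, maxFieldLength));
        return new ArrayList<>(terms.subList(0, kept));
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("patient", "reports", "mild", "headache");
        // With a limit of 2, only the first two terms are kept.
        System.out.println(indexedTerms(terms, 2));
    }
}
```

Because truncation is silent, text near the end of a very long field can be missing from search results unless the limit is raised before indexing.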

setMaxBufferedDocs

public void setMaxBufferedDocs(int i)
                        throws OperatorErrorException
See the Lucene documentation. Probably unneeded. Only used on writers.

Parameters:
i -
Throws:
OperatorErrorException

setMaxMergeDocs

public void setMaxMergeDocs(int i)
                     throws OperatorErrorException
See the Lucene documentation. Probably unneeded. Only used on writers.

Parameters:
i -
Throws:
OperatorErrorException

setMergeFactor

public void setMergeFactor(int i)
                    throws OperatorErrorException
How many documents to add in memory before writing to the index. Has a large effect on performance. Only used on writers.

Parameters:
i - How many docs to add before writing.
Throws:
OperatorErrorException

setUseCompoundFile

public void setUseCompoundFile(boolean bool)
                        throws OperatorErrorException
Whether or not to use the new compound file format. Reduces the number of open files, but also reduces indexing performance.

Parameters:
bool -
Throws:
OperatorErrorException

getLocation

public java.io.File getLocation()
Returns:
The folder that contains this index.

setAnalyzer

public void setAnalyzer(org.apache.lucene.analysis.Analyzer analyzer)
Change the analyzer of an index. You MUST call this method if you constructed the index with an analyzer of your own. You must always use the same Analyzer.

Parameters:
analyzer -

Copyright: (c) 2004-2006 Mayo Foundation for Medical Education and Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the triple-shield Mayo logo are trademarks and service marks of MFMER.