edu.mayo.informatics.indexer.api
Class IndexerService

java.lang.Object
  extended by edu.mayo.informatics.indexer.api.IndexerService

public class IndexerService
extends java.lang.Object

This class sits on top of multiple indexes and manages them for you.

Author:
Dan Armbrust

Constructor Summary
IndexerService(java.lang.String rootLocation)
          Create an indexer service on a directory.
IndexerService(java.lang.String rootLocation, boolean configureLog4j)
          Create an indexer service on a directory.
 
Method Summary
 void addDocument(java.lang.String indexName, org.apache.lucene.document.Document document)
          Add a document to the index.
 void addDocument(java.lang.String indexName, org.apache.lucene.document.Document document, org.apache.lucene.analysis.Analyzer analyzer)
          Add a document to the index.
 void closeBatchRemover(java.lang.String indexName)
          Closes the currently open IndexReader.
 void closeWriter(java.lang.String indexName)
          Closes the currently open index writer.
 void createIndex(java.lang.String indexName)
          Create a new index, using a default analyzer, and no stopwords.
 void createIndex(java.lang.String indexName, org.apache.lucene.analysis.Analyzer analyzer)
          Create a new index, using a user-specified analyzer.
 void createIndex(java.lang.String indexName, java.lang.String[] stopwords)
          Create a new index, using the default analyzer, and your supplied stopwords.
 void deleteIndex(java.lang.String indexName)
          Delete an index from this IndexerService.
 void forceUnlockIndex(java.lang.String indexName)
           
 SearchServiceInterface getIndexSearcher(java.lang.String indexName)
           
 SearchServiceInterface getIndexSearcher(java.lang.String[] indexNames, boolean parallelSearch)
           
 SearchServiceInterface getIndexSearcher(java.lang.String[] indexNames, boolean useInMemoryIndex, boolean parallelSearch)
           
 SearchServiceInterface getIndexSearcher(java.lang.String indexName, boolean useInMemoryIndex)
           
 LuceneIndexReader getLuceneIndexReader(java.lang.String indexName)
           
 LuceneIndexReader getLuceneIndexReader(java.lang.String indexName, boolean useInMemoryIndex)
           
 MetaData getMetaData()
           
 java.lang.String getRootLocation()
           
 java.lang.String[] listIndexes()
          List all of the indexes in this IndexerService.
 void openBatchRemover(java.lang.String indexName)
          Opens a reader on the index.
 void openBatchWriter(java.lang.String indexName, boolean clearContents, boolean useRAMIndexer)
          Opens a new batchIndexWriter.
 void openWriter(java.lang.String indexName, boolean clearContents)
          Opens an index writer.
 void optimizeIndex(java.lang.String indexName)
          Run the low-level Lucene optimize command on an index.
 void refreshAvailableIndexes()
           
 int removeDocument(java.lang.String indexName, java.lang.String documentIdentifier)
          Use this method to remove a document from an index.
 int removeDocument(java.lang.String indexName, java.lang.String field, java.lang.String fieldValue)
          This will remove all documents from an index whose field equals fieldValue.
 void setDocsPerTempIndex(java.lang.String indexName, int docs)
          How many documents to write out per temporary index.
 void setMaxBufferedDocs(java.lang.String indexName, int docs)
          How many documents to buffer before merging.
 void setMaxFieldLength(java.lang.String indexName, int size)
          Lucene will truncate fields longer than this.
 void setMaxMergeDocs(java.lang.String indexName, int docs)
          See the Lucene documentation.
 void setMergeFactor(java.lang.String indexName, int mergeFactor)
          How many documents to add in memory before writing to the index.
 void setUseCompoundFile(java.lang.String indexName, boolean bool)
          Whether or not to use the new compound file format.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

IndexerService

public IndexerService(java.lang.String rootLocation,
                      boolean configureLog4j)
               throws InternalErrorException
Create an indexer service on a directory. An IndexerService can contain multiple indexes.

Parameters:
rootLocation - The directory where all of your indexes are located
configureLog4j - Whether or not to configure a log4j appender.
Throws:
InternalErrorException

IndexerService

public IndexerService(java.lang.String rootLocation)
               throws InternalErrorException
Create an indexer service on a directory. An IndexerService can contain multiple indexes.

Parameters:
rootLocation - The directory where all of your indexes are located
Throws:
InternalErrorException
Method Detail

refreshAvailableIndexes

public void refreshAvailableIndexes()
                             throws InternalErrorException
Throws:
InternalErrorException

createIndex

public void createIndex(java.lang.String indexName,
                        java.lang.String[] stopwords)
Create a new index, using the default analyzer, and your supplied stopwords.

Parameters:
indexName - Name of index - will match folder name created to hold it
stopwords - Words not to index

createIndex

public void createIndex(java.lang.String indexName)
Create a new index, using a default analyzer, and no stopwords.

Parameters:
indexName - Name of index - will match folder name created to hold it

createIndex

public void createIndex(java.lang.String indexName,
                        org.apache.lucene.analysis.Analyzer analyzer)
Create a new index, using a user-specified analyzer. When you search an index created with an analyzer of your choosing, you MUST supply the same analyzer to the searcher. If you don't, your searches will return invalid results.

Parameters:
indexName - Name of index - will match folder name created to hold it
analyzer - The analyzer to use while creating the index. See the Lucene Documentation.
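
The analyzer warning above can be illustrated with a toy model. The sketch below does not use Lucene or IndexerService: the two "analyzers" are plain functions and the "index" is a map from token to document ids, all of them hypothetical stand-ins chosen only to show why a search-time analyzer that differs from the index-time analyzer misses documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Toy model of index-time vs. search-time analysis (not the Lucene API). */
final class AnalyzerMismatchSketch {

    /** Hypothetical analyzer: splits on whitespace and lower-cases each token. */
    static final Function<String, List<String>> LOWERCASING = text -> {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    };

    /** Hypothetical analyzer: splits on whitespace and keeps case as written. */
    static final Function<String, List<String>> CASE_PRESERVING = text -> {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    };

    /** Builds a token-to-document-ids map using the given analyzer. */
    static Map<String, Set<Integer>> buildIndex(List<String> docs,
                                                Function<String, List<String>> analyzer) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : analyzer.apply(docs.get(id))) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    /** Analyzes the query, then returns ids of documents containing any query token. */
    static Set<Integer> search(Map<String, Set<Integer>> postings, String query,
                               Function<String, List<String>> analyzer) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyzer.apply(query)) {
            hits.addAll(postings.getOrDefault(token, Collections.<Integer>emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("Mayo Clinic", "clinic notes");
        Map<String, Set<Integer>> postings = buildIndex(docs, LOWERCASING);
        System.out.println("same analyzer:      " + search(postings, "Mayo", LOWERCASING));
        System.out.println("different analyzer: " + search(postings, "Mayo", CASE_PRESERVING));
    }
}
```

In this model the index only contains lower-cased tokens, so a query analyzed without lower-casing looks up "Mayo" and finds nothing, even though the document is in the index.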

deleteIndex

public void deleteIndex(java.lang.String indexName)
                 throws IndexNotFoundException,
                        InternalIndexerErrorException
Delete an index from this IndexerService.

Parameters:
indexName -
Throws:
IndexNotFoundException
InternalIndexerErrorException

openBatchWriter

public void openBatchWriter(java.lang.String indexName,
                            boolean clearContents,
                            boolean useRAMIndexer)
                     throws IndexNotFoundException,
                            IndexWriterAlreadyOpenException,
                            InternalIndexerErrorException
Opens a new batchIndexWriter. Use this for large updates.

Parameters:
indexName -
clearContents - true erases current content, false appends.
useRAMIndexer - true for a RAM indexer, false for a file system indexer
Throws:
IndexNotFoundException
IndexWriterAlreadyOpenException
InternalIndexerErrorException

openWriter

public void openWriter(java.lang.String indexName,
                       boolean clearContents)
                throws IndexNotFoundException,
                       IndexWriterAlreadyOpenException,
                       InternalIndexerErrorException
Opens an index writer. Best for small updates. Use a Batch Writer for large updates.

Parameters:
indexName -
clearContents - True erases the current index. False appends to it.
Throws:
IndexNotFoundException
IndexWriterAlreadyOpenException
InternalIndexerErrorException

closeWriter

public void closeWriter(java.lang.String indexName)
                 throws IndexNotFoundException,
                        InternalIndexerErrorException
Closes the currently open index writer.

Parameters:
indexName -
Throws:
IndexNotFoundException
InternalIndexerErrorException

closeBatchRemover

public void closeBatchRemover(java.lang.String indexName)
                       throws IndexNotFoundException,
                              InternalIndexerErrorException
Closes the currently open IndexReader.

Parameters:
indexName -
Throws:
IndexNotFoundException
InternalIndexerErrorException

openBatchRemover

public void openBatchRemover(java.lang.String indexName)
                      throws IndexNotFoundException,
                             InternalIndexerErrorException
Opens a reader on the index. This is really only needed for batch deletes.

Parameters:
indexName - The index to open
Throws:
IndexNotFoundException
InternalIndexerErrorException

addDocument

public void addDocument(java.lang.String indexName,
                        org.apache.lucene.document.Document document)
                 throws IndexNotFoundException,
                        InternalIndexerErrorException,
                        DocumentMissingUniqueDocumentIdentifierException
Add a document to the index. Note: If you construct your own document (rather than using a provided document generator), you must add a field named Index.UNIQUE_DOCUMENT_IDENTIFIER_FIELD with a unique value to your document. If the values are not unique, you will not be able to easily remove documents from the index. If you are going to add multiple items, it will be much faster to call openWriter or openBatchWriter before your additions and closeWriter after them.

Parameters:
indexName - The index to add the document to.
document - The document to add
Throws:
IndexNotFoundException
InternalIndexerErrorException
IndexWriterNotOpenException
DocumentMissingUniqueDocumentIdentifierException
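
The advice above about opening a writer before adding several documents can be sketched as follows. To keep the example compilable without this library or Lucene, it codes against a small hypothetical interface, WriterOps, that mirrors only the three documented signatures (with Object standing in for org.apache.lucene.document.Document and the checked exceptions collapsed to Exception). The helper name addAll and the recording fake are assumptions for illustration, not part of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the open / add / close pattern for adding many documents. */
final class BatchAddSketch {

    /**
     * Hypothetical stand-in for the documented IndexerService methods.
     * Object replaces Lucene's Document; Exception replaces the checked exceptions.
     */
    interface WriterOps {
        void openWriter(String indexName, boolean clearContents) throws Exception;
        void addDocument(String indexName, Object document) throws Exception;
        void closeWriter(String indexName) throws Exception;
    }

    /** Opens the writer once, adds every document, and always closes the writer. */
    static int addAll(WriterOps service, String indexName, List<?> documents) throws Exception {
        service.openWriter(indexName, false); // false = append to the existing index
        int added = 0;
        try {
            for (Object document : documents) {
                service.addDocument(indexName, document);
                added++;
            }
        } finally {
            service.closeWriter(indexName); // runs even if an add fails
        }
        return added;
    }

    /** Recording fake, used only to exercise the sketch. */
    static final class RecordingOps implements WriterOps {
        final List<String> calls = new ArrayList<>();
        public void openWriter(String indexName, boolean clearContents) { calls.add("open"); }
        public void addDocument(String indexName, Object document) { calls.add("add"); }
        public void closeWriter(String indexName) { calls.add("close"); }
    }

    public static void main(String[] args) throws Exception {
        RecordingOps ops = new RecordingOps();
        int added = addAll(ops, "myIndex", Arrays.asList("doc1", "doc2"));
        System.out.println(added + " documents, calls: " + ops.calls);
    }
}
```

With the real class, the same shape would apply with an IndexerService instance in place of WriterOps; for large updates, openBatchWriter could replace openWriter as the description suggests.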

optimizeIndex

public void optimizeIndex(java.lang.String indexName)
                   throws IndexNotFoundException,
                          InternalIndexerErrorException
Run the low-level Lucene optimize command on an index. This is usually only necessary after a large number of deletions from an index.

Parameters:
indexName -
Throws:
IndexNotFoundException
InternalIndexerErrorException

addDocument

public void addDocument(java.lang.String indexName,
                        org.apache.lucene.document.Document document,
                        org.apache.lucene.analysis.Analyzer analyzer)
                 throws IndexNotFoundException,
                        InternalIndexerErrorException,
                        DocumentMissingUniqueDocumentIdentifierException
Add a document to the index. Note: If you construct your own document (rather than using a provided document generator), you must add a field named Index.UNIQUE_DOCUMENT_IDENTIFIER_FIELD with a unique value to your document. If the values are not unique, you will not be able to easily remove documents from the index. If you are going to add multiple items, it will be much faster to call openWriter or openBatchWriter before your additions and closeWriter after them.

Parameters:
indexName - The index to add the document to.
document - The document to add
analyzer - The analyzer to use
Throws:
IndexNotFoundException
InternalIndexerErrorException
IndexWriterNotOpenException
DocumentMissingUniqueDocumentIdentifierException

removeDocument

public int removeDocument(java.lang.String indexName,
                          java.lang.String documentIdentifier)
                   throws IndexNotFoundException,
                          InternalIndexerErrorException,
                          OperatorErrorException
Use this method to remove a document from an index. You may not have an index writer open while you do this. If you are going to do multiple deletions, you will get much better performance if you call openBatchRemover before your deletions and closeBatchRemover after them.

Parameters:
indexName -
documentIdentifier - The document identifier that was used to add the document.
Returns:
The number of documents removed
Throws:
IndexNotFoundException
IndexReaderNotOpenException
InternalIndexerErrorException
OperatorErrorException
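
The batch-removal advice above can be sketched in the same way. The RemoverOps interface below is a hypothetical stand-in that mirrors the documented openBatchRemover, removeDocument and closeBatchRemover signatures (checked exceptions collapsed to Exception); the helper removeAll and the in-memory FakeIndex are assumptions made so the example runs without the library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the open / remove / close pattern for deleting many documents. */
final class BatchRemoveSketch {

    /** Hypothetical stand-in for the documented IndexerService methods. */
    interface RemoverOps {
        void openBatchRemover(String indexName) throws Exception;
        int removeDocument(String indexName, String documentIdentifier) throws Exception;
        void closeBatchRemover(String indexName) throws Exception;
    }

    /**
     * Opens the remover once, removes each identifier, and always closes the remover.
     * Per the description above, no index writer may be open during this call.
     */
    static int removeAll(RemoverOps service, String indexName, List<String> identifiers)
            throws Exception {
        service.openBatchRemover(indexName);
        int removed = 0;
        try {
            for (String identifier : identifiers) {
                // removeDocument reports how many documents it removed
                removed += service.removeDocument(indexName, identifier);
            }
        } finally {
            service.closeBatchRemover(indexName); // runs even if a removal fails
        }
        return removed;
    }

    /** In-memory fake holding document identifiers, used only to exercise the sketch. */
    static final class FakeIndex implements RemoverOps {
        final Set<String> identifiers = new HashSet<>();
        boolean removerOpen;

        FakeIndex(String... initial) { identifiers.addAll(Arrays.asList(initial)); }

        public void openBatchRemover(String indexName) { removerOpen = true; }
        public int removeDocument(String indexName, String documentIdentifier) {
            return identifiers.remove(documentIdentifier) ? 1 : 0;
        }
        public void closeBatchRemover(String indexName) { removerOpen = false; }
    }

    public static void main(String[] args) throws Exception {
        FakeIndex index = new FakeIndex("a", "b", "c");
        int removed = removeAll(index, "myIndex", Arrays.asList("a", "c", "missing"));
        System.out.println(removed + " removed, remaining: " + index.identifiers);
    }
}
```

Summing the return values gives the total number of documents removed across the batch; identifiers that match nothing simply contribute zero in this model.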

removeDocument

public int removeDocument(java.lang.String indexName,
                          java.lang.String field,
                          java.lang.String fieldValue)
                   throws IndexNotFoundException,
                          InternalIndexerErrorException,
                          OperatorErrorException
This will remove all documents from an index whose field equals fieldValue. You may not have an index writer open while you do this. If you are going to do multiple deletions, you will get much better performance if you call openBatchRemover before your deletions and closeBatchRemover after them.

Parameters:
indexName - The index to remove from.
field - The field to look for.
fieldValue - The value to match.
Returns:
The number of documents removed.
Throws:
IndexNotFoundException
IndexReaderNotOpenException
InternalIndexerErrorException
OperatorErrorException

listIndexes

public java.lang.String[] listIndexes()
List all of the indexes in this IndexerService.

Returns:
All of the indexes that exist

getIndexSearcher

public SearchServiceInterface getIndexSearcher(java.lang.String[] indexNames,
                                               boolean parallelSearch)
                                        throws InternalIndexerErrorException,
                                               IndexNotFoundException
Throws:
InternalIndexerErrorException
IndexNotFoundException

getIndexSearcher

public SearchServiceInterface getIndexSearcher(java.lang.String[] indexNames,
                                               boolean useInMemoryIndex,
                                               boolean parallelSearch)
                                        throws InternalIndexerErrorException,
                                               IndexNotFoundException
Throws:
InternalIndexerErrorException
IndexNotFoundException

getLuceneIndexReader

public LuceneIndexReader getLuceneIndexReader(java.lang.String indexName)
                                       throws IndexNotFoundException,
                                              InternalIndexerErrorException
Throws:
IndexNotFoundException
InternalIndexerErrorException

getLuceneIndexReader

public LuceneIndexReader getLuceneIndexReader(java.lang.String indexName,
                                              boolean useInMemoryIndex)
                                       throws IndexNotFoundException,
                                              InternalIndexerErrorException
Throws:
IndexNotFoundException
InternalIndexerErrorException

getIndexSearcher

public SearchServiceInterface getIndexSearcher(java.lang.String indexName)
                                        throws InternalIndexerErrorException,
                                               IndexNotFoundException
Throws:
InternalIndexerErrorException
IndexNotFoundException

getIndexSearcher

public SearchServiceInterface getIndexSearcher(java.lang.String indexName,
                                               boolean useInMemoryIndex)
                                        throws InternalIndexerErrorException,
                                               IndexNotFoundException
Throws:
InternalIndexerErrorException
IndexNotFoundException

getRootLocation

public java.lang.String getRootLocation()

forceUnlockIndex

public void forceUnlockIndex(java.lang.String indexName)
                      throws IndexNotFoundException,
                             InternalIndexerErrorException
Throws:
IndexNotFoundException
InternalIndexerErrorException

getMetaData

public MetaData getMetaData()

setDocsPerTempIndex

public void setDocsPerTempIndex(java.lang.String indexName,
                                int docs)
                         throws IndexNotFoundException,
                                OperatorErrorException
How many documents to write out per temporary index. Used for performance/controlling open file handles. Only used on writers.

Parameters:
indexName -
docs -
Throws:
IndexNotFoundException
OperatorErrorException

setMaxBufferedDocs

public void setMaxBufferedDocs(java.lang.String indexName,
                               int docs)
                        throws IndexNotFoundException,
                               OperatorErrorException
How many documents to buffer before merging. Only used on writers. See Lucene documentation.

Parameters:
indexName -
docs -
Throws:
IndexNotFoundException
OperatorErrorException

setMaxFieldLength

public void setMaxFieldLength(java.lang.String indexName,
                              int size)
                       throws IndexNotFoundException,
                              OperatorErrorException
Lucene will truncate fields longer than this. Only used on writers.

Parameters:
indexName -
size -
Throws:
IndexNotFoundException
OperatorErrorException

setMaxMergeDocs

public void setMaxMergeDocs(java.lang.String indexName,
                            int docs)
                     throws IndexNotFoundException,
                            OperatorErrorException
See the Lucene documentation. Probably unneeded. Only used on writers.

Parameters:
indexName -
docs -
Throws:
IndexNotFoundException
OperatorErrorException

setMergeFactor

public void setMergeFactor(java.lang.String indexName,
                           int mergeFactor)
                    throws IndexNotFoundException,
                           OperatorErrorException
How many documents to add in memory before writing to the index. Has a large effect on performance. Only used on writers.

Parameters:
indexName -
mergeFactor -
Throws:
IndexNotFoundException
OperatorErrorException

setUseCompoundFile

public void setUseCompoundFile(java.lang.String indexName,
                               boolean bool)
                        throws IndexNotFoundException,
                               OperatorErrorException
Whether or not to use the new compound file format. Reduces the number of open files, but also reduces indexing performance.

Parameters:
indexName -
bool -
Throws:
IndexNotFoundException
OperatorErrorException

Copyright: (c) 2004-2006 Mayo Foundation for Medical Education and Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the triple-shield Mayo logo are trademarks and service marks of MFMER.