java.lang.Object
  org.apache.lucene.analysis.Analyzer
    edu.mayo.informatics.indexer.lucene.analyzers.EncoderAnalyzer
public class EncoderAnalyzer
extends org.apache.lucene.analysis.Analyzer
An analyzer that generates a code for each token to be indexed. Uses the Apache Commons Codec package.
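As a rough illustration of what "a code for each token" means, the sketch below lowercases the input, splits it on whitespace, and emits a Soundex code per token. It is not the EncoderAnalyzer implementation: the class name `TokenCodeSketch` and its methods are hypothetical, and a simplified JDK-only Soundex stands in for a Commons Codec encoder such as DoubleMetaphone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of the indexer library.
final class TokenCodeSketch {

    // Soundex digit for each letter A..Z ('0' means the letter is not coded).
    private static final String DIGITS = "01230120022455012623010202";

    // Simplified American Soundex: first letter plus up to three digits, zero padded.
    static String soundex(String word) {
        StringBuilder code = new StringBuilder();
        char lastDigit = '0';
        for (char raw : word.toCharArray()) {
            char c = Character.toUpperCase(raw);
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char digit = DIGITS.charAt(c - 'A');
            if (code.length() == 0) {
                code.append(c);          // keep the first letter as-is
                lastDigit = digit;
            } else if (c == 'H' || c == 'W') {
                continue;                // H and W do not separate equal codes
            } else {
                if (digit != '0' && digit != lastDigit) {
                    code.append(digit);
                }
                lastDigit = digit;       // vowels reset the run of equal codes
            }
            if (code.length() == 4) {
                break;
            }
        }
        if (code.length() == 0) {
            return "";
        }
        while (code.length() < 4) {
            code.append('0');
        }
        return code.toString();
    }

    // Lowercase, split on whitespace, and encode each token.
    static List<String> encodeTokens(String text) {
        List<String> codes = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            String code = soundex(token);
            if (!code.isEmpty()) {
                codes.add(code);
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(encodeTokens("Robert and Rupert")); // prints [R163, A530, R163]
    }
}
```

Because similar-sounding tokens map to the same code, indexing the codes rather than the raw tokens lets a query for one spelling match the other.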
Constructor Summary | |
---|---|
EncoderAnalyzer() | Create a new EncoderAnalyzer. |
EncoderAnalyzer(org.apache.commons.codec.Encoder encoder) | Create a new EncoderAnalyzer that uses the given encoder. |
EncoderAnalyzer(org.apache.commons.codec.Encoder encoder, java.lang.String[] stopWords, char[] charsToRemove, char[] charsToTreatAsWhiteSpace) | Create a new EncoderAnalyzer - everything configured by the user. |
EncoderAnalyzer(java.lang.String[] stopWords, char[] charsToRemove, char[] charsToTreatAsWhiteSpace) | Create a new EncoderAnalyzer - uses a default configured DoubleMetaphone encoder. |
Method Summary | |
---|---|
WhiteSpaceLowerCaseAnalyzer | getWhiteSpaceLowerCaseAnalyzer(): Not intended to be part of the public API; it is public only because of design requirements. |
void | setWhiteSpaceLowerCaseAnalyzer(WhiteSpaceLowerCaseAnalyzer whiteSpaceLowerCaseAnalyzer): Not intended to be part of the public API; it is public only because of design requirements. |
org.apache.lucene.analysis.TokenStream | tokenStream(java.lang.String fieldname, java.io.Reader reader): Create a token stream for this analyzer. |
Methods inherited from class org.apache.lucene.analysis.Analyzer |
---|
getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public EncoderAnalyzer()
Create a new EncoderAnalyzer with a DoubleMetaphone generator set to the default values.

public EncoderAnalyzer(org.apache.commons.codec.Encoder encoder, java.lang.String[] stopWords, char[] charsToRemove, char[] charsToTreatAsWhiteSpace)
Parameters:
encoder - The encoder to use. DoubleMetaphone, Metaphone, Soundex, etc.
stopWords - Stop words to use - not used if null or empty.
charsToRemove - characters to remove from input (before encoding) - not used if null or empty.
charsToTreatAsWhiteSpace - characters to treat as whitespace (split points) - defaults to typical whitespace if null or empty.

public EncoderAnalyzer(java.lang.String[] stopWords, char[] charsToRemove, char[] charsToTreatAsWhiteSpace)
Parameters:
stopWords - Stop words to use - not used if null or empty.
charsToRemove - characters to remove from input (before encoding) - not used if null or empty.
charsToTreatAsWhiteSpace - characters to treat as whitespace (split points) - defaults to typical whitespace if null or empty.

public EncoderAnalyzer(org.apache.commons.codec.Encoder encoder)
Parameters:
encoder - The encoder to use.
See Also:
WhiteSpaceLowerCaseAnalyzer
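The null-or-empty rules for stopWords, charsToRemove and charsToTreatAsWhiteSpace can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's code: the class `PreprocessSketch` and its `tokenize` method are hypothetical, tokens are assumed to be lowercased, "typical whitespace" is assumed to mean `Character.isWhitespace`, and removal is assumed to take precedence when a character appears in both arrays.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the indexer library.
final class PreprocessSketch {

    static List<String> tokenize(String text, String[] stopWords,
                                 char[] charsToRemove, char[] charsToTreatAsWhiteSpace) {
        // Null or empty stop word list: nothing is filtered.
        Set<String> stops = new HashSet<>();
        if (stopWords != null) {
            for (String word : stopWords) {
                stops.add(word.toLowerCase(Locale.ROOT));
            }
        }
        // Null or empty split list: fall back to ordinary whitespace (assumption).
        boolean defaultSplit =
                charsToTreatAsWhiteSpace == null || charsToTreatAsWhiteSpace.length == 0;

        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (contains(charsToRemove, c)) {
                continue; // dropped before any encoding would happen
            }
            boolean split = defaultSplit
                    ? Character.isWhitespace(c)
                    : contains(charsToTreatAsWhiteSpace, c);
            if (split) {
                flush(current, stops, tokens);
            } else {
                current.append(Character.toLowerCase(c));
            }
        }
        flush(current, stops, tokens);
        return tokens;
    }

    // Null-safe membership test; a null or empty array matches nothing.
    private static boolean contains(char[] chars, char c) {
        if (chars == null) {
            return false;
        }
        for (char candidate : chars) {
            if (candidate == c) {
                return true;
            }
        }
        return false;
    }

    // Emit the pending token unless it is empty or a stop word.
    private static void flush(StringBuilder current, Set<String> stops, List<String> out) {
        if (current.length() > 0) {
            String token = current.toString();
            if (!stops.contains(token)) {
                out.add(token);
            }
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("The O'Brien-Smith file",
                new String[] {"the"}, new char[] {'\''}, new char[] {' ', '-'}));
    }
}
```

With these inputs the apostrophe is removed, the hyphen and space act as split points, and "the" is dropped as a stop word; each surviving token would then be passed to the encoder.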
Method Detail |
---|
public final org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldname, java.io.Reader reader)
Specified by:
tokenStream in class org.apache.lucene.analysis.Analyzer
public WhiteSpaceLowerCaseAnalyzer getWhiteSpaceLowerCaseAnalyzer()
public void setWhiteSpaceLowerCaseAnalyzer(WhiteSpaceLowerCaseAnalyzer whiteSpaceLowerCaseAnalyzer)
Copyright: (c) 2004-2006 Mayo Foundation for Medical Education and Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the triple-shield Mayo logo are trademarks and service marks of MFMER.