edu.mayo.informatics.indexer.lucene.analyzers
Class SnowballAnalyzer

java.lang.Object
  extended by org.apache.lucene.analysis.Analyzer
      extended by edu.mayo.informatics.indexer.lucene.analyzers.SnowballAnalyzer

public class SnowballAnalyzer
extends org.apache.lucene.analysis.Analyzer

An analyzer that uses a Snowball stemmer to stem each term before it is inserted into the index.

Author:
Dan Armbrust

Constructor Summary
SnowballAnalyzer()
          Create a new SnowballAnalyzer.
SnowballAnalyzer(boolean keepOrigional, java.lang.String snowballName)
          Create a Snowball analyzer.
SnowballAnalyzer(boolean keepOrigional, java.lang.String snowballName, java.lang.String[] stopWords, char[] charsToRemove, char[] charsToTreatAsWhiteSpace)
          Create a Snowball analyzer.
 
Method Summary
 WhiteSpaceLowerCaseAnalyzer getWhiteSpaceLowerCaseAnalyzer()
          Not intended as part of the public API; public only because of design requirements.
 void setWhiteSpaceLowerCaseAnalyzer(WhiteSpaceLowerCaseAnalyzer whiteSpaceLowerCaseAnalyzer)
          Not intended as part of the public API; public only because of design requirements.
 org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldname, java.io.Reader reader)
           
 
Methods inherited from class org.apache.lucene.analysis.Analyzer
getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SnowballAnalyzer

public SnowballAnalyzer()
Create a new SnowballAnalyzer. Uses all defaults of WhiteSpaceLowerCaseAnalyzer (see that class) and the default Snowball stemmer name "English".


SnowballAnalyzer

public SnowballAnalyzer(boolean keepOrigional,
                        java.lang.String snowballName,
                        java.lang.String[] stopWords,
                        char[] charsToRemove,
                        char[] charsToTreatAsWhiteSpace)
Create a Snowball analyzer.

Parameters:
keepOrigional -
snowballName - Available stemmers are listed in org.tartarus.snowball.ext. The name of a stemmer is the part of the class name before "Stemmer"; e.g., the stemmer in EnglishStemmer is named "English".
stopWords - Stop words to use; not used if null or empty.
charsToRemove - Characters to remove from the input (before normalization); not used if null or empty.
charsToTreatAsWhiteSpace - Characters to treat as whitespace (split points); defaults to typical whitespace if null or empty.
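
The roles of stopWords, charsToRemove and charsToTreatAsWhiteSpace can be illustrated with a small standalone sketch. It is not the library's implementation: the class PreprocessSketch and its tokenize method are hypothetical, it uses only the JDK, and it assumes that removal happens before splitting, that tokens are lower-cased (as the name WhiteSpaceLowerCaseAnalyzer suggests), and that stop words are matched against the lower-cased token. Stemming is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for illustration only; not part of the indexer library. */
final class PreprocessSketch {

    /** Splits input into lower-cased tokens, honoring the three optional settings. */
    static List<String> tokenize(String input, String[] stopWords,
                                 char[] charsToRemove, char[] charsToTreatAsWhiteSpace) {
        // Null or empty stop words means "no stop word filtering".
        Set<String> stops = new HashSet<>();
        if (stopWords != null) {
            stops.addAll(Arrays.asList(stopWords));
        }
        // Null or empty split characters means "fall back to typical whitespace".
        boolean customSplit = charsToTreatAsWhiteSpace != null
                && charsToTreatAsWhiteSpace.length > 0;

        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (contains(charsToRemove, c)) {
                continue; // assumed order: removal is applied before splitting
            }
            boolean split = customSplit
                    ? contains(charsToTreatAsWhiteSpace, c)
                    : Character.isWhitespace(c);
            if (split) {
                flush(current, stops, tokens);
            } else {
                current.append(Character.toLowerCase(c));
            }
        }
        flush(current, stops, tokens);
        return tokens;
    }

    /** Returns true if c occurs in chars; a null array contains nothing. */
    private static boolean contains(char[] chars, char c) {
        if (chars == null) {
            return false;
        }
        for (char x : chars) {
            if (x == c) {
                return true;
            }
        }
        return false;
    }

    /** Emits the pending token unless it is empty or a stop word, then resets the buffer. */
    private static void flush(StringBuilder current, Set<String> stops, List<String> tokens) {
        if (current.length() > 0) {
            String token = current.toString();
            if (!stops.contains(token)) {
                tokens.add(token);
            }
            current.setLength(0);
        }
    }
}
```

In this sketch, passing null for all three optional arguments splits on ordinary whitespace and filters nothing, mirroring the "not used if null or empty" wording above.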

SnowballAnalyzer

public SnowballAnalyzer(boolean keepOrigional,
                        java.lang.String snowballName)
Create a Snowball analyzer. Uses all defaults of WhiteSpaceLowerCaseAnalyzer (see that class).

Parameters:
keepOrigional -
snowballName - Available stemmers are listed in org.tartarus.snowball.ext. The name of a stemmer is the part of the class name before "Stemmer"; e.g., the stemmer in EnglishStemmer is named "English".
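
The naming rule for snowballName can be made concrete with a short sketch. The class StemmerNameSketch and its methods are hypothetical helpers, not part of this library or of Snowball; they only encode the rule stated above (stemmer name = class name minus the "Stemmer" suffix, in package org.tartarus.snowball.ext). How SnowballAnalyzer itself resolves the name is not documented here.

```java
/** Hypothetical helper for illustration only; not part of the indexer library. */
final class StemmerNameSketch {

    static final String STEMMER_PACKAGE = "org.tartarus.snowball.ext";
    static final String SUFFIX = "Stemmer";

    /** Maps a stemmer name such as "English" to its fully qualified class name. */
    static String toClassName(String snowballName) {
        return STEMMER_PACKAGE + "." + snowballName + SUFFIX;
    }

    /** Maps a simple or fully qualified stemmer class name back to the stemmer name. */
    static String toSnowballName(String className) {
        String simple = className.substring(className.lastIndexOf('.') + 1);
        if (!simple.endsWith(SUFFIX) || simple.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Not a stemmer class name: " + className);
        }
        return simple.substring(0, simple.length() - SUFFIX.length());
    }

    /** Returns true if a stemmer class with this name can be loaded from the classpath. */
    static boolean isAvailable(String snowballName) {
        try {
            Class.forName(toClassName(snowballName));
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

A check such as StemmerNameSketch.isAvailable("English") could be run before constructing the analyzer; its result depends on whether the Snowball stemmer classes are on the classpath.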
Method Detail

tokenStream

public final org.apache.lucene.analysis.TokenStream tokenStream(java.lang.String fieldname,
                                                                java.io.Reader reader)
Specified by:
tokenStream in class org.apache.lucene.analysis.Analyzer

getWhiteSpaceLowerCaseAnalyzer

public WhiteSpaceLowerCaseAnalyzer getWhiteSpaceLowerCaseAnalyzer()
This method is not intended to be part of the public API; it is public only because of design requirements. Do not use this method.


setWhiteSpaceLowerCaseAnalyzer

public void setWhiteSpaceLowerCaseAnalyzer(WhiteSpaceLowerCaseAnalyzer whiteSpaceLowerCaseAnalyzer)
This method is not intended to be part of the public API; it is public only because of design requirements. Do not use this method.


Copyright: (c) 2004-2006 Mayo Foundation for Medical Education and Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the triple-shield Mayo logo are trademarks and service marks of MFMER.