edu.mayo.informatics.indexer.lucene
Class IDFNeutralSimilarity

java.lang.Object
  extended by org.apache.lucene.search.Similarity
      extended by org.apache.lucene.search.DefaultSimilarity
          extended by edu.mayo.informatics.indexer.lucene.IDFNeutralSimilarity
All Implemented Interfaces:
java.io.Serializable

public class IDFNeutralSimilarity
extends org.apache.lucene.search.DefaultSimilarity

This class overrides the IDF (inverse document frequency) portion of the Lucene scoring algorithm. See the idf method description for details.

Author:
Dan Armbrust
See Also:
Serialized Form

Constructor Summary
IDFNeutralSimilarity()
           
 
Method Summary
 float idf(int arg0, int arg1)
          I'm returning a constant for idf instead of the way it used to be calculated, because when I had lots of documents with an other_designation of "renal calculus" but only a couple of documents that contained "renal calculus" somewhere in a preferred_designation, the longer preferred_designations were being scored higher than the exact-match other_designation.
 
Methods inherited from class org.apache.lucene.search.DefaultSimilarity
coord, lengthNorm, queryNorm, sloppyFreq, tf
 
Methods inherited from class org.apache.lucene.search.Similarity
decodeNorm, encodeNorm, getDefault, getNormDecoder, idf, idf, scorePayload, setDefault, tf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

IDFNeutralSimilarity

public IDFNeutralSimilarity()
Method Detail

idf

public float idf(int arg0,
                 int arg1)
I'm returning a constant for idf instead of the way it used to be calculated, because when I had lots of documents with an other_designation of "renal calculus" but only a couple of documents that contained "renal calculus" somewhere in a preferred_designation, the longer preferred_designations were being scored higher than the exact-match other_designation. In other words, by returning a constant, I should be removing the inverse document frequency from the score calculation. IDF weights a term by how rarely it appears in the index, so a term found in fewer documents contributes more to the score.

Overrides:
idf in class org.apache.lucene.search.DefaultSimilarity
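
The effect described above can be sketched without a Lucene dependency. The class below is a hypothetical illustration, not this class's source: IdfSketch, defaultIdf, and neutralIdf are made-up names, and the constant 1.0f is an assumption, since this page does not state which constant is returned. defaultIdf reproduces the formula that DefaultSimilarity used in Lucene releases of this era, where the two int parameters (shown here as arg0 and arg1) are declared by Lucene as docFreq and numDocs.

```java
// Standalone sketch; it does not extend or call any Lucene class.
final class IdfSketch {

    // Formula used by DefaultSimilarity.idf(docFreq, numDocs) in Lucene 2.x:
    // terms that occur in fewer documents receive a larger weight.
    static float defaultIdf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // IDF-neutral variant: ignores both arguments and returns a constant.
    // ASSUMPTION: 1.0f is illustrative; the real constant is not documented here.
    static float neutralIdf(int docFreq, int numDocs) {
        return 1.0f;
    }

    public static void main(String[] args) {
        int numDocs = 10000; // hypothetical index size
        int rareDocFreq = 2; // e.g. a term seen in only a couple of documents
        int commonDocFreq = 500; // e.g. a term seen in many documents

        System.out.println("default idf, rare term:   " + defaultIdf(rareDocFreq, numDocs));
        System.out.println("default idf, common term: " + defaultIdf(commonDocFreq, numDocs));
        System.out.println("neutral idf, rare term:   " + neutralIdf(rareDocFreq, numDocs));
        System.out.println("neutral idf, common term: " + neutralIdf(commonDocFreq, numDocs));
    }
}
```

With the default formula the rare term gets the larger weight, which is what skewed the ranking in the "renal calculus" case. With the constant, document frequency no longer influences the score, and ranking is left to the remaining factors such as tf and lengthNorm.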

Copyright: (c) 2004-2006 Mayo Foundation for Medical Education and Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the triple-shield Mayo logo are trademarks and service marks of MFMER.