Object
LuceneIndexToDict
public class LuceneIndexToDict
Utility class to convert the stored fields of a Lucene index into a spelling
dictionary. This is generally less desirable than integrating dictionary
creation into the original index creation process (e.g. using
SpellWritingAnalyzer or SpellWritingFilter), since the integrated approach
captures non-stored as well as stored fields. Still, if that isn't an option,
or if you simply want to test out spelling correction, after-the-fact
dictionary creation may be useful.
Constructor Summary | |
---|---|
LuceneIndexToDict() |
Method Summary | |
---|---|
static void | createDict(Directory indexDir, File dictDir): Read a Lucene index and make a spelling dictionary from it. |
static void | createDict(Directory indexDir, File dictDir, ProgressTracker prog): Read a Lucene index and make a spelling dictionary from it. |
static void | createDict(IndexReader indexReader, Analyzer analyzer, SpellWriter spellWriter, ProgressTracker prog): Read a Lucene index and make a spelling dictionary from it. |
static void | main(String[] args): Command-line interface for building a dictionary directly from a Lucene index without writing any code. |
static void | queueWords(IndexReader reader, Analyzer analyzer, SpellWriter writer, ProgressTracker prog): Re-tokenize all the words in stored fields within a Lucene index, and queue them to a spelling dictionary. |
Methods inherited from class Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public LuceneIndexToDict()
Method Detail |
---|
public static void createDict(Directory indexDir, File dictDir) throws IOException
Read a Lucene index and make a spelling dictionary from it (see StopAnalyzer.ENGLISH_STOP_WORDS).
Parameters:
indexDir - directory containing the Lucene index
dictDir - directory to receive the spelling dictionary
Throws:
IOException
public static void createDict(Directory indexDir, File dictDir, ProgressTracker prog) throws IOException
Read a Lucene index and make a spelling dictionary from it (see StopAnalyzer.ENGLISH_STOP_WORDS).
Parameters:
indexDir - directory containing the Lucene index
dictDir - directory to receive the spelling dictionary
prog - tracker called periodically to display progress
Throws:
IOException
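The idea of a tracker that is "called periodically" can be sketched as follows. This is a minimal illustration only: the `Progress` interface, its `report` method, and `processWithProgress` are hypothetical names invented for this sketch, and the real ProgressTracker API may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows one way a periodic progress callback can work.
class ProgressSketch {
    // Hypothetical callback type; NOT the library's ProgressTracker.
    interface Progress {
        void report(int percentDone, String description);
    }

    // Walks totalDocs items and reports whenever the whole-number
    // percentage changes, so large runs do not flood the display.
    static void processWithProgress(int totalDocs, Progress prog) {
        int lastPct = -1;
        for (int done = 1; done <= totalDocs; done++) {
            // ... per-document work would go here ...
            int pct = (int) (100L * done / totalDocs);
            if (pct != lastPct) {
                prog.report(pct, "Processed " + done + " of " + totalDocs + " documents");
                lastPct = pct;
            }
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        processWithProgress(4, (pct, msg) -> lines.add(pct + "% " + msg));
        lines.forEach(System.out::println);
    }
}
```

Reporting only when the percentage changes keeps the callback cheap relative to the work being tracked.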
public static void createDict(IndexReader indexReader, Analyzer analyzer, SpellWriter spellWriter, ProgressTracker prog) throws IOException
Read a Lucene index and make a spelling dictionary from it (see StopAnalyzer.ENGLISH_STOP_WORDS).
Parameters:
indexReader - used to read fields from a Lucene index
analyzer - used to tokenize fields from the index; generally,
this should do minimal filtering, taking care to avoid substantive
token modification (such as stemming or depluralization). A good
choice is MinimalAnalyzer.
spellWriter - receives words to be added to the dictionary
prog - tracker called periodically to display progress
Throws:
IOException
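The advice to avoid stemming or depluralization can be illustrated with a small self-contained sketch. It does not use Lucene or the spelling classes: `AnalyzerChoiceSketch`, `buildDict`, and the deliberately crude `stripPlural` filter are hypothetical stand-ins, chosen only to show how token modification puts non-words into a dictionary.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.UnaryOperator;

// Illustrative only: compares a dictionary built from unmodified tokens
// with one built after a crude plural-stripping filter.
class AnalyzerChoiceSketch {
    // Deliberately naive "depluralizer", included only to show the problem.
    static String stripPlural(String word) {
        return word.endsWith("s") ? word.substring(0, word.length() - 1) : word;
    }

    // Builds word -> frequency, applying the given filter to each token.
    static Map<String, Integer> buildDict(String text, UnaryOperator<String> filter) {
        Map<String, Integer> freq = new TreeMap<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            freq.merge(filter.apply(raw), 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        String text = "Analysis of indexes and index status";
        // Minimal filtering keeps real words such as "status".
        System.out.println(buildDict(text, UnaryOperator.identity()));
        // Plural stripping yields entries such as "statu" that no user would type.
        System.out.println(buildDict(text, AnalyzerChoiceSketch::stripPlural));
    }
}
```

A spelling dictionary is matched against what users actually type, so entries altered by the filter could not be offered as sensible corrections.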
public static void queueWords(IndexReader reader, Analyzer analyzer, SpellWriter writer, ProgressTracker prog) throws IOException
Re-tokenize all the words in stored fields within a Lucene index, and queue them to a spelling dictionary.
Parameters:
reader - used to read fields from a Lucene index
analyzer - used to tokenize fields from the index; generally,
this should do minimal filtering, taking care to avoid substantive
token modification (such as stemming or depluralization). A good
choice is MinimalAnalyzer.
writer - receives words to be added to the dictionary
prog - tracker called periodically to display progress
Throws:
IOException
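The re-tokenize-and-queue flow described for queueWords can be sketched in plain Java. This is an assumption-laden illustration, not the library's implementation: documents are modeled as maps of stored field name to text (standing in for an IndexReader), the tokenizer is a simple letter splitter (standing in for an Analyzer), and a `Consumer<String>` stands in for the SpellWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Illustrative only: hypothetical stand-in for the queueWords flow.
class QueueWordsSketch {
    // Minimal tokenization: split on non-letters, lower-case, drop stop words.
    // No stemming or depluralization, so words are queued unchanged.
    static List<String> tokenize(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}]+")) {
            String word = raw.toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && !stopWords.contains(word)) {
                out.add(word);
            }
        }
        return out;
    }

    // Each document is a map of stored field name -> stored text (an assumption).
    // Returns the number of words handed to the sink.
    static int queueWords(List<Map<String, String>> docs,
                          Set<String> stopWords,
                          Consumer<String> sink) {
        int queued = 0;
        for (Map<String, String> doc : docs) {
            for (String value : doc.values()) {
                for (String word : tokenize(value, stopWords)) {
                    sink.accept(word);
                    queued++;
                }
            }
        }
        return queued;
    }

    public static void main(String[] args) {
        List<String> dictionary = new ArrayList<>();
        List<Map<String, String>> docs = Collections.singletonList(
            Collections.singletonMap("title", "The Indexes of Lucene"));
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "of"));
        int n = queueWords(docs, stopWords, dictionary::add);
        System.out.println(n + " words queued: " + dictionary);
    }
}
```

Only stored text is available to this kind of after-the-fact pass, which is why building the dictionary during indexing can capture more words.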
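public static void main(String[] args)
Command-line interface for building a dictionary directly from a Lucene index without writing any code.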