Object
  ContextMarker
public class ContextMarker
Workhorse class that handles marking hits, context surrounding hits, and search terms.
Created: Dec 26, 2004
Field Summary | |
---|---|
private MarkCollector | collector - Client instance which receives the resulting marks |
private WordIter | iter0 - Iterator used for locating the start of the hit/context |
private WordIter | iter1 - Iterator used for locating the end of the hit/context |
static int | MARK_ALL_TERMS - See MARK_NO_TERMS |
static int | MARK_CONTEXT_TERMS - See MARK_NO_TERMS |
static int | MARK_NO_TERMS - Term-marking mode: terms are not marked. The other modes are MARK_SPAN_TERMS, MARK_CONTEXT_TERMS, and MARK_ALL_TERMS. |
static int | MARK_SPAN_TERMS - See MARK_NO_TERMS |
private int | maxContext - Target size (in chars) of the context surrounding each hit |
private int | prevEndWord - End of the previous context |
private Set | stopSet - Set of stop-words to avoid marking outside of hits |
private int | termMode - Whether to mark terms inside/outside hits, context, etc. |
private Set | terms - Set of search terms to mark |
private int | termsMarkedPos - Word position up to which we've marked all terms |
private MarkPos | tmpPos - Used for temporary position storage |
Constructor Summary | |
---|---|
ContextMarker(int maxContext, int termMode, Set terms, Set stopSet, WordIter wordIter, MarkCollector collector) - Construct a new marker |
Method Summary | |
---|---|
(package private) void | emitMarks(Span posSpan, MarkPos contextStart, MarkPos contextEnd) - Emit all the marks for the given hit. |
(package private) void | findContext(Span posSpan, Span nextSpan, MarkPos contextStart, MarkPos contextEnd) - Locate the start and end of context for the given hit. |
void | mark(Span[] posOrderSpans, int maxContext) - Mark a series of spans. |
static void | markField(FieldSpans fieldSpans, String field, WordIter iter, int maxContext, int termMode, Set stopSet, MarkCollector collector) - Mark context, spans, and terms within a field of data. |
void | markField(String field, FieldSpans fieldSpans, MarkCollector collector) - Mark context, spans, and terms within the given field of this document. |
void | markField(String field, FieldSpans fieldSpans, WordIter iter, int maxContext, int termMode, Set stopSet, MarkCollector collector) - Mark context, spans, and terms within the given field of this document. |
private void | markTerms(WordIter iter, int fromPos, int toPos, boolean markStopWords) - Mark terms from 'fromPos' up to (but not including) 'toPos'. |
Methods inherited from class Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int MARK_NO_TERMS
The following modes can be used for term marking:
MARK_NO_TERMS: Terms are not marked.
MARK_SPAN_TERMS: Search terms are marked only within span hits.
MARK_CONTEXT_TERMS: Search terms are marked within span hits and, if found, within the context surrounding those hits.
MARK_ALL_TERMS: Search terms are marked wherever they are found.
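The four modes form a progression, each marking terms in a wider region than the one before. The following is a minimal sketch of that progression, not the class's actual implementation: the integer values of the constants, the `Region` enum, and the `markedRegions` and `shouldMark` helpers are all assumptions made for illustration, since this page does not give the constant values.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative only: models which regions each term-marking mode covers. */
class TermModeSketch {
    // Hypothetical values; the real constants' values are not given on this page.
    static final int MARK_NO_TERMS = 0;
    static final int MARK_SPAN_TERMS = 1;
    static final int MARK_CONTEXT_TERMS = 2;
    static final int MARK_ALL_TERMS = 3;

    /** Where a word sits relative to a hit (hypothetical helper type). */
    enum Region { HIT, CONTEXT, ELSEWHERE }

    /** Returns the regions in which search terms are marked for the given mode. */
    static Set<Region> markedRegions(int termMode) {
        switch (termMode) {
            case MARK_NO_TERMS:      return EnumSet.noneOf(Region.class);
            case MARK_SPAN_TERMS:    return EnumSet.of(Region.HIT);
            case MARK_CONTEXT_TERMS: return EnumSet.of(Region.HIT, Region.CONTEXT);
            case MARK_ALL_TERMS:     return EnumSet.allOf(Region.class);
            default:
                throw new IllegalArgumentException("Unknown term mode: " + termMode);
        }
    }

    /** True if a search term found in the given region would be marked. */
    static boolean shouldMark(int termMode, Region region) {
        return markedRegions(termMode).contains(region);
    }

    public static void main(String[] args) {
        System.out.println(markedRegions(MARK_CONTEXT_TERMS)); // prints [HIT, CONTEXT]
    }
}
```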
public static final int MARK_SPAN_TERMS
See MARK_NO_TERMS
public static final int MARK_CONTEXT_TERMS
See MARK_NO_TERMS
public static final int MARK_ALL_TERMS
See MARK_NO_TERMS
private int maxContext
private WordIter iter0
private WordIter iter1
private MarkCollector collector
private Set terms
private Set stopSet
private int termMode
See MARK_SPAN_TERMS, etc.
private int termsMarkedPos
private MarkPos tmpPos
private int prevEndWord
Constructor Detail |
---|
public ContextMarker(int maxContext, int termMode, Set terms, Set stopSet, WordIter wordIter, MarkCollector collector)
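The constructor takes both a set of search terms and a set of stop words, and the field summary says stop words should not be marked outside of hits. The sketch below shows one way those two sets and the `markStopWords` flag of `markTerms` might interact. It is an assumption-laden illustration: it uses a plain `List<String>` of words in place of `WordIter`, returns positions instead of emitting marks to a `MarkCollector`, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: term selection with stop-word filtering. */
class TermMarkSketch {
    /**
     * Returns the word positions in [fromPos, toPos) that hold a search term.
     * Stop words are skipped unless markStopWords is true (as one might do
     * inside a hit, where every matched term is wanted).
     */
    static List<Integer> termPositions(List<String> words, int fromPos, int toPos,
                                       Set<String> terms, Set<String> stopSet,
                                       boolean markStopWords) {
        List<Integer> marked = new ArrayList<>();
        for (int pos = fromPos; pos < toPos && pos < words.size(); pos++) {
            String word = words.get(pos);
            if (!terms.contains(word)) {
                continue; // not a search term
            }
            if (!markStopWords && stopSet.contains(word)) {
                continue; // a stop word, and stop words are not wanted here
            }
            marked.add(pos);
        }
        return marked;
    }
}
```

For the words `the cat sat on the mat` with terms `{the, cat}` and stop set `{the}`, this sketch selects only the position of `cat` when `markStopWords` is false, and both occurrences of `the` as well when it is true.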
Method Detail |
---|
public void markField(String field, FieldSpans fieldSpans, MarkCollector collector)
field - field name to mark
fieldSpans - spans to mark with
collector - collector to receive the marks
public void markField(String field, FieldSpans fieldSpans, WordIter iter, int maxContext, int termMode, Set stopSet, MarkCollector collector)
field - field name to mark
iter - iterator over the words in the field
maxContext - target number of characters for context around each hit (including the text of the hit itself.) 80 is often a good choice. Specify zero to turn off context marking.
termMode - what areas to mark hits - see MARK_NO_TERMS.
stopSet - set of stop words to avoid marking outside hits
collector - collector to receive the marks
public static void markField(FieldSpans fieldSpans, String field, WordIter iter, int maxContext, int termMode, Set stopSet, MarkCollector collector)
field - field name to mark
iter - iterator over the words in the field
maxContext - target number of characters for context around each hit (including the text of the hit itself.) 80 is often a good choice. Specify zero to turn off context marking.
termMode - what areas to mark hits - see MARK_NO_TERMS.
stopSet - set of stop words to avoid marking outside hits
collector - collector to receive the marks
public void mark(Span[] posOrderSpans, int maxContext)
posOrderSpans - Spans to mark, in ascending position order.
maxContext - Target # of chars for context around hits (0 for none)
void findContext(Span posSpan, Span nextSpan, MarkPos contextStart, MarkPos contextEnd)
posSpan - hit for which to find context
nextSpan - following hit (or null if none)
contextStart - OUT: start of context
contextEnd - OUT: end of context
void emitMarks(Span posSpan, MarkPos contextStart, MarkPos contextEnd)
posSpan - hit for which to emit marks
contextStart - start of context (or null if context disabled)
contextEnd - end of context (or null if context disabled)
private void markTerms(WordIter iter, int fromPos, int toPos, boolean markStopWords)
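The `maxContext` parameter is described above as a target character count that includes the hit text itself, with zero turning context off. As a rough illustration of what that budget implies, the sketch below splits the characters left over after the hit evenly on both sides and clamps to the text bounds. This is a simplified assumption, not the algorithm used by `findContext`: the real method works with `WordIter` and `MarkPos`, presumably respects word boundaries and neighboring hits, and the class and method names here are hypothetical.

```java
/** Illustrative only: character-based context window around a hit. */
class ContextWindowSketch {
    /**
     * Returns {start, end} character offsets (end exclusive) of the context
     * around a hit, or null when maxContext is zero or negative, mirroring
     * "specify zero to turn off context marking".
     */
    static int[] contextBounds(int textLen, int hitStart, int hitEnd, int maxContext) {
        if (maxContext <= 0) {
            return null; // context marking disabled
        }
        int hitLen = hitEnd - hitStart;
        // Characters left for surrounding text once the hit itself is counted.
        int budget = Math.max(0, maxContext - hitLen);
        int before = budget / 2;
        int after = budget - before;
        // Clamp to the text; this sketch does not hand unused budget to the other side.
        int start = Math.max(0, hitStart - before);
        int end = Math.min(textLen, hitEnd + after);
        return new int[] { start, end };
    }
}
```

With a 200-character text, a hit at offsets 100 to 110, and `maxContext` of 80, this yields the window 65 to 145, which is 80 characters in total including the hit.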