Object
  SectionInfoStack
public class SectionInfoStack
This class maintains information about the current nesting of sections
in a text document that the TextIndexer program is processing.
On-line documents are stored as "nodes" in XML files that contain information
about the document, and the document text itself. The nodes usually form
a hierarchical tree structure, with the outer-most nodes recording
various bits of information about the text within. Inside the outer nodes
are additional nodes that record the organization of the text itself,
including things like section, chapter, and paragraph information. To the
text indexer program and search engine, sections have special significance.
Text in two adjacent sections that have different names is considered
not to be "near" one another, so that proximity searches will not produce
results that span two or more sections.
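The effect on proximity searches can be pictured with a minimal sketch. It assumes that each indexed word receives an integer position and that a gap (a "bump") is added to the position counter whenever the section name changes; the class `ProximitySketch` and its methods are hypothetical and are not part of the text indexer.

```java
// Hypothetical illustration only; not the text indexer's actual code.
class ProximitySketch {
    private int nextPos = 0;           // position the next word will receive
    private String lastSection = null; // name of the section the previous word came from

    // Assigns a position to one word. When the section name differs from the
    // previous word's section, the counter first jumps ahead by "bump" words.
    int addWord(String section, int bump) {
        if (lastSection != null && !lastSection.equals(section)) {
            nextPos += bump;
        }
        lastSection = section;
        return nextPos++;
    }

    // Two words are "near" if their positions differ by at most maxDistance.
    static boolean near(int posA, int posB, int maxDistance) {
        return Math.abs(posA - posB) <= maxDistance;
    }
}
```

With a bump larger than the widest proximity window in use, the last word of one section and the first word of the next can never satisfy `near`, which is the behavior described above.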
Since sections can be nested inside one another, a stack of the current
nesting level needs to be maintained by the text indexer when a document
is being processed. Doing so does two things:
- It allows unnamed inner sections to inherit properties from the parent
sections that contain them.
- When the end of a named section has been reached, the text indexer can
return to using the parent section's properties and continue processing.
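The two points above can be sketched with a small stack of section names. This is an illustration under stated assumptions, not the actual class: `NestingSketch`, `enter`, `leave`, and `current` are hypothetical names, and only the section name is tracked rather than the full set of section attributes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the real SectionInfoStack.
class NestingSketch {
    private final Deque<String> names = new ArrayDeque<>();

    // Entering a section: an empty name means "inherit from the parent".
    void enter(String name) {
        if (name.isEmpty() && !names.isEmpty()) {
            name = names.peek();
        }
        names.push(name);
    }

    // Leaving a section makes the parent's name the current one again.
    void leave() {
        names.pop();
    }

    // Name in effect for text at the current nesting level ("" if none).
    String current() {
        return names.isEmpty() ? "" : names.peek();
    }

    public static void main(String[] args) {
        NestingSketch s = new NestingSketch();
        s.enter("chapter");
        s.enter("");                      // unnamed: inherits "chapter"
        System.out.println(s.current());
        s.enter("footnote");
        s.leave();                        // back to the inherited "chapter"
        System.out.println(s.current());
    }
}
```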
The SectionInfoStack class is used to maintain the current
state of nested sections encountered in a document by the text indexer,
while the SectionInfo class holds
the section attributes for each entry in the current stack.
Field Summary | |
---|---|
private LinkedList | defaultMetaInfo - Top-level list of meta-data. |
private Stack | infoStack - Actual generic stack that holds the SectionInfo objects. |
Constructor Summary | |
---|---|
SectionInfoStack() | |
Method Summary | |
---|---|
int | depth() - Return the current depth of the top section on the nesting stack. |
int | indexFlag() - Return the index flag for the top section on the nesting stack. |
boolean | isEmpty() - Query method to determine if there are any nested sections currently on the nesting stack. |
LinkedList | metaInfo() |
SectionInfo | peek() - Return a reference to the section currently at the top of the nesting stack without popping it. |
void | pop() - Section de-stacking operator. |
SectionInfo | prev() - Return a reference to the section just below the current one. |
void | push() - Implicit depth-push operator. |
void | push(int indexFlag, String sectionType, int sectionBump, float wordBoost, int sentenceBump, int spellFlag, String subDocument, LinkedList metaInfo) - Explicit section push operator. |
private void | push(SectionInfo info) - Push a SectionInfo instance onto the top of the section stack. |
int | sectionBump() - Return the section bump value for the top section on the nesting stack. |
String | sectionType() - Return the section type name for the top section on the nesting stack. |
int | sentenceBump() - Return the sentence bump value for the top entry in the stack. |
int | setSectionBump(int newBump) - This function sets the section bump value for the top entry in the stack. |
int | spellFlag() - Return the spell flag for the top section on the nesting stack. |
String | subDocument() - Return the subdocument name for the top of the nesting stack. |
private SectionInfo | top() - Return a reference to the top entry in the section stack, if any. |
int | useSectionBump() - Use and clear the section bump value for the top section on the nesting stack. |
boolean | valuesChanged(int indexFlag, String sectionType, int sectionBump, float wordBoost, int sentenceBump, int spellFlag, String subDocument, LinkedList metaInfo) - Query method to determine if the passed set of section attributes differs from the section at the top of the nesting stack. |
float | wordBoost() - Return the word boost value for the top entry in the stack. |
Methods inherited from class Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private Stack infoStack
Actual generic stack that holds the SectionInfo objects.
private LinkedList defaultMetaInfo
Constructor Detail |
---|
public SectionInfoStack()
Method Detail |
---|
public void push(int indexFlag, String sectionType, int sectionBump, float wordBoost, int sentenceBump, int spellFlag, String subDocument, LinkedList metaInfo)
indexFlag - A flag indicating whether or not the current section
should be indexed. Valid values are parentIndex, index, noIndex.
sectionType - The type name for the section being pushed. This may be
either a caller-defined string or an empty string
(""). Note that if an empty string is passed, the
section name is inherited from the parent section
(if defined).
sectionBump - The offset (in words) of the current section from
the previous section. Used to lower the relevance of
(or completely avoid) proximity matches that
span two sections. This value is typically set to
zero (for no de-emphasis of proximity matches
across adjacent sections), or a value greater than
or equal to the chunk overlap used by the index (to
completely avoid proximity matches across adjacent
sections).
wordBoost - Boost factor to apply to words in this section.
Values greater than 1.0 make the words found in this
section more relevant in a search, while values less
than 1.0 make words in the section less relevant.
sentenceBump - The offset (in words) for this section between the
start of a new sentence and the end of the previous
one. Like the section bump, this value is used to
adjust the relevance of proximity matches made
across sentence boundaries. Typical values are
one (for no de-emphasis of proximity matches across
sentence boundaries), a value between one and the
chunk overlap for the index (for partial de-emphasis
of proximity matches across sentence boundaries), or
a value greater than or equal to the chunk size (to
completely avoid proximity matches across sentence
boundaries).
spellFlag - A flag indicating whether or not words in the current
section should be added to the spelling correction
dictionary. Valid values are parentSpell, spell, noSpell.
subDocument - The name of the subdocument being pushed. A subdocument
is part of a document that is to be represented as a
single searchable unit in search results, but should be
viewed in the context of its larger document. A subdocument
can have its own meta-data. If the empty string ""
is passed, the subdocument is unchanged.
metaInfo - List of meta-data for the subdocument being pushed.
If null, the parent meta-data list will be used.
depth-push method is called to save space. Otherwise, the new section entry
with the passed attributes is created and placed on the stack.
SectionInfo class.
public void push()
depth field of the SectionInfo class to maintain the
correct depth for nested sections with identical attributes while avoiding
pushing entire duplicate entries.
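The depth-push idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `DepthPushSketch` and `Entry` are hypothetical names, only the section type is compared (the real class compares all section attributes), a fresh entry is assumed to start at depth 0, and `pop()` is assumed to decrement the depth before removing an entry.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; names, starting depth, and pop behavior are assumptions.
class DepthPushSketch {
    static final class Entry {
        final String sectionType;
        int depth = 0; // extra nested sections sharing this entry (assumed meaning)
        Entry(String sectionType) { this.sectionType = sectionType; }
    }

    private final Deque<Entry> stack = new ArrayDeque<>();

    // Explicit push: reuse the top entry when the attributes are unchanged,
    // otherwise create and stack a new entry.
    void push(String sectionType) {
        Entry top = stack.peek();
        if (top != null && top.sectionType.equals(sectionType)) {
            push();
        } else {
            stack.push(new Entry(sectionType));
        }
    }

    // Implicit depth-push: record one more nesting level on the top entry.
    void push() {
        Entry top = stack.peek();
        if (top != null) {
            top.depth++;
        }
    }

    // Undo one nesting level; remove the entry once no shared levels remain.
    void pop() {
        Entry top = stack.peek();
        if (top == null) {
            return;
        }
        if (top.depth > 0) {
            top.depth--;
        } else {
            stack.pop();
        }
    }

    // Depth of the top entry, or -1 if the stack is empty.
    int depth() {
        Entry top = stack.peek();
        return top == null ? -1 : top.depth;
    }

    int entries() {
        return stack.size();
    }
}
```

Three nested sections with the same attributes then occupy a single stack entry with a depth counter, rather than three identical entries.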
public void pop()
public SectionInfo peek()
public SectionInfo prev()
public boolean isEmpty()
true - No nested sections currently on the stack.
false - One or more nested sections are currently
on the stack.
public boolean valuesChanged(int indexFlag, String sectionType, int sectionBump, float wordBoost, int sentenceBump, int spellFlag, String subDocument, LinkedList metaInfo)
true - One or more of the passed attributes do not
match the attributes for the section
currently at the top of the stack.
false - The passed attributes are identical to those
for the section currently at the top of the
stack.
true.
public int depth()
-1 if the stack is empty.
public int indexFlag()
index or noIndex.
parentIndex.
That value is only used as an argument when calling the
explicit section-push operator to force the new section to adopt its
parent's index flag.
indexFlag attribute, see the indexFlag field in the SectionInfo class.
public int spellFlag()
spell or noSpell.
parentSpell.
That value is only used as an argument when calling the
explicit section-push operator to force the new section to adopt its
parent's spell flag.
spellFlag attribute, see the spellFlag field in the SectionInfo class.
public String subDocument()
public LinkedList metaInfo()
public String sectionType()
sectionType attribute, see the sectionType field in the SectionInfo class.
public int sectionBump()
defaultSectionBump value if the stack is empty.
sectionBump attribute, see the sectionBump field in the SectionInfo class.
public int useSectionBump()
defaultSectionBump value if the stack is empty.
sectionBump attribute, see the sectionBump field in the SectionInfo class.
public int setSectionBump(int newBump)
newBump - New bump value to set for top entry.
sectionBump attribute, see the sectionBump field in the SectionInfo class.
public float wordBoost()
SectionInfo.defaultWordBoost.
Otherwise, it returns the word boost for the section currently
at the top of the stack.
wordBoost attribute, see the wordBoost field in the SectionInfo class.
public int sentenceBump()
SectionInfo.defaultSentenceBump.
Otherwise, it returns the sentence bump for the section
currently at the top of the stack.
sentenceBump attribute, see the sentenceBump field in the SectionInfo class.
private void push(SectionInfo info)
Push a SectionInfo instance onto the top of the section stack.
SectionInfo instance.
private SectionInfo top()
null if the stack is empty.
SectionInfo instance.
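Several of the accessors documented above fall back to a default value when the stack is empty, and the private top() method returns null in that case. A minimal sketch of that pattern, using wordBoost as the example, might look like the following. `DefaultAccessorSketch`, `Info`, and the value of `DEFAULT_WORD_BOOST` are hypothetical; the real default is defined by SectionInfo.defaultWordBoost and its value is not given here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the "default when empty" accessor pattern.
class DefaultAccessorSketch {
    // Assumed neutral value; the real default is SectionInfo.defaultWordBoost.
    static final float DEFAULT_WORD_BOOST = 1.0f;

    static final class Info {
        final float wordBoost;
        Info(float wordBoost) { this.wordBoost = wordBoost; }
    }

    private final Deque<Info> stack = new ArrayDeque<>();

    void push(float wordBoost) {
        stack.push(new Info(wordBoost));
    }

    boolean isEmpty() {
        return stack.isEmpty();
    }

    // Top entry, or null if the stack is empty.
    private Info top() {
        return stack.peek();
    }

    // Word boost of the top entry, or the default if the stack is empty.
    float wordBoost() {
        Info info = top();
        return info == null ? DEFAULT_WORD_BOOST : info.wordBoost;
    }
}
```

Routing every accessor through a single null-returning helper keeps the empty-stack check in one place, so callers never need to test isEmpty() before reading an attribute.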