org.cdlib.xtf.textIndexer
Class TagFilter

Object
  extended by TokenStream
      extended by TokenFilter
          extended by TagFilter

public class TagFilter
extends TokenFilter

Detects XML elements in a token stream and marks them with a distinct token type (see XML_TYPE).

Author:
Martin Haye

Field Summary
private  String attrName
          Name of the current attribute
private  int attrNameStart
          Start position of the attribute name
private  String elementName
          Name of the last element we've started
private  Token elementStart
          Start position of the inside of the element definition
private  boolean inAttrName
          True when we're within an attribute name
private  boolean inElement
          True while we're processing inside an element definition
private  boolean inEndTag
          True while we're in an element end tag
private  boolean inQuote
          True when we're within a quoted attribute value
private  int quoteStart
          Position of initial quote mark
private  char[] srcChars
          The source text being tokenized
static Tester tester
          Basic regression test
private  LinkedList tokenQueue
          Queued tokens
static String XML_TYPE
          Type of returned 'element' tokens
 
Fields inherited from class TokenFilter
input
 
Constructor Summary
TagFilter(TokenStream input, String srcText)
          Construct a token stream to mark XML elements.
 
Method Summary
 Token next()
          Retrieve the next token in the stream.
private  Token processNext(Token curToken)
          Does most of the work of processing a token
 
Methods inherited from class TokenFilter
close
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

XML_TYPE

public static final String XML_TYPE
Type of returned 'element' tokens


srcChars

private char[] srcChars
The source text being tokenized


inElement

private boolean inElement
True while we're processing inside an element definition


elementName

private String elementName
Name of the last element we've started


inEndTag

private boolean inEndTag
True while we're in an element end tag


elementStart

private Token elementStart
Start position of the inside of the element definition


inQuote

private boolean inQuote
True when we're within a quoted attribute value


quoteStart

private int quoteStart
Position of initial quote mark


inAttrName

private boolean inAttrName
True when we're within an attribute name


attrNameStart

private int attrNameStart
Start position of the attribute name


attrName

private String attrName
Name of the current attribute


tokenQueue

private LinkedList tokenQueue
Queued tokens


tester

public static final Tester tester
Basic regression test

Constructor Detail

TagFilter

public TagFilter(TokenStream input,
                 String srcText)
Construct a token stream to mark XML elements.

Parameters:
input - Input stream of tokens to process
srcText - The source text being tokenized

Method Detail

next

public Token next()
           throws IOException
Retrieve the next token in the stream.

Specified by:
next in class TokenStream
Throws:
IOException
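
A consumer typically calls next() in a loop until the stream is exhausted. The following is a minimal sketch of that loop, assuming the older Lucene convention in which next() returns null at the end of the stream. SketchToken, SketchTokenStream, ListTokenStream and DrainDemo are hypothetical stand-ins written for this example, not the real Token and TokenStream classes, and the "xml" and "word" type values are placeholders.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for the classes named on this page. They are NOT
// the real Token/TokenStream; they only mimic the next() calling convention.
class SketchToken {
    final String text;
    final String type;

    SketchToken(String text, String type) {
        this.text = text;
        this.type = type;
    }
}

abstract class SketchTokenStream {
    /** Returns the next token, or null when the stream is exhausted (assumed). */
    abstract SketchToken next() throws IOException;
}

/** Exposes a fixed list of tokens through the next() convention. */
class ListTokenStream extends SketchTokenStream {
    private final Iterator<SketchToken> it;

    ListTokenStream(List<SketchToken> tokens) {
        this.it = tokens.iterator();
    }

    @Override
    SketchToken next() {
        return it.hasNext() ? it.next() : null;
    }
}

class DrainDemo {
    /** Pulls tokens until next() returns null, recording "text/type" for each. */
    static List<String> drain(SketchTokenStream stream) throws IOException {
        List<String> out = new ArrayList<>();
        for (SketchToken t = stream.next(); t != null; t = stream.next()) {
            out.add(t.text + "/" + t.type);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        SketchTokenStream stream = new ListTokenStream(Arrays.asList(
            new SketchToken("<p>", "xml"),    // placeholder type value
            new SketchToken("hello", "word")));
        System.out.println(drain(stream));
    }
}
```

With the real classes, the same loop would presumably wrap a TagFilter built from an upstream TokenStream and the source text, and compare each token's type against XML_TYPE.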

processNext

private Token processNext(Token curToken)
Does most of the work of processing a token

Parameters:
curToken - The token from the input stream
Returns:
The token to return immediately, or null if there is none. If the result is null and curToken was also null, the caller should return null immediately.
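
The return contract above suggests a driver loop along the following lines. This is only a sketch of one way a caller might honor that contract, not the actual body of next(). Tokens are modeled as plain strings, null stands for the end of the input, and QueueingFilterSketch, TagMarkingSketch and the "#tag" marker are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical driver loop around a processNext-style helper. */
abstract class QueueingFilterSketch {
    private final Iterator<String> input;

    /** Tokens queued by processNext that are waiting to be returned. */
    protected final LinkedList<String> tokenQueue = new LinkedList<>();

    QueueingFilterSketch(List<String> tokens) {
        this.input = tokens.iterator();
    }

    /** Returns a token to emit now, or null for none; may also queue tokens. */
    protected abstract String processNext(String curToken);

    String next() {
        while (true) {
            // Emit anything queued by an earlier call first.
            if (!tokenQueue.isEmpty()) return tokenQueue.removeFirst();

            String cur = input.hasNext() ? input.next() : null;
            String result = processNext(cur);
            if (result != null) return result;

            // Null result and null input token: the stream is finished.
            if (cur == null) return null;
            // Otherwise the token was consumed silently; keep reading.
        }
    }
}

/** Toy subclass: swallows blank tokens and emits a marker before each tag. */
class TagMarkingSketch extends QueueingFilterSketch {
    TagMarkingSketch(List<String> tokens) {
        super(tokens);
    }

    @Override
    protected String processNext(String curToken) {
        if (curToken == null) return null;           // end of input
        if (curToken.trim().isEmpty()) return null;  // nothing to emit for blanks
        if (curToken.startsWith("<")) {
            tokenQueue.add(curToken);                // the tag follows the marker
            return "#tag";
        }
        return curToken;
    }

    public static void main(String[] args) {
        TagMarkingSketch filter = new TagMarkingSketch(Arrays.asList("<p>", " ", "hi"));
        List<String> out = new ArrayList<>();
        for (String t = filter.next(); t != null; t = filter.next()) {
            out.add(t);
        }
        System.out.println(out);
    }
}
```

The loop keeps the three outcomes separate: a non-null result is returned at once, a null result for a real token means the token was consumed and reading continues, and a null result for a null token ends the stream.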