org.cdlib.xtf.textIndexer
Class HTMLIndexSource

Object
  extended by IndexSource
      extended by XMLIndexSource
          extended by HTMLIndexSource

public class HTMLIndexSource
extends XMLIndexSource

Transforms an HTML file to a single-record XML file.

Author:
Martin Haye

Field Summary
private  File htmlFile
          Source of HTML data
 
Constructor Summary
HTMLIndexSource(File htmlFile, String key, Templates[] preFilters, Templates displayStyle, StructuredStore lazyStore)
          Constructs an index source for the given HTML file and initializes all the fields from the arguments
 
Method Summary
protected  InputSource filterInput()
          Transforms the HTML file into XML data
 
Methods inherited from class XMLIndexSource
displayStyle, key, nextRecord, normalize, path, preFilters, removeDoctypeDecl, totalSize
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

htmlFile

private File htmlFile
Source of HTML data

Constructor Detail

HTMLIndexSource

public HTMLIndexSource(File htmlFile,
                       String key,
                       Templates[] preFilters,
                       Templates displayStyle,
                       StructuredStore lazyStore)
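Constructs an index source for the given HTML file and initializes all the fields from the arguments (htmlFile, key, preFilters, displayStyle, lazyStore).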

Method Detail

filterInput

protected InputSource filterInput()
                           throws IOException
Transforms the HTML file into XML data and returns it as an InputSource.

Overrides:
filterInput in class XMLIndexSource
Throws:
IOException
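
Example (illustrative sketch only):
The override pattern described above can be sketched with JDK classes alone. Everything in this block is an assumption made for illustration: FilterInputSketch, BaseSource, HtmlSource and toXml are hypothetical names, not part of XTF, and the tag-stripping in toXml is a deliberately crude placeholder for whatever HTML-to-XML conversion the real class performs. The sketch only shows the shape of the technique: a subclass overrides a filterInput()-style hook, reads its HTML file, and hands back well-formed XML wrapped in an InputSource.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.xml.sax.InputSource;

// Hypothetical sketch; none of these names come from XTF itself.
class FilterInputSketch {

    /** Stand-in for a parent class that exposes a filterInput() hook. */
    abstract static class BaseSource {
        protected abstract InputSource filterInput() throws IOException;
    }

    /** Stand-in for an HTML-backed source that overrides the hook. */
    static class HtmlSource extends BaseSource {
        private final File htmlFile; // source of HTML data

        HtmlSource(File htmlFile) {
            this.htmlFile = htmlFile;
        }

        @Override
        protected InputSource filterInput() throws IOException {
            // Assumption: the file is UTF-8 encoded.
            String html = new String(
                Files.readAllBytes(htmlFile.toPath()), StandardCharsets.UTF_8);
            // Wrap the converted XML text so a SAX parser can read it.
            InputSource source = new InputSource(new StringReader(toXml(html)));
            source.setSystemId(htmlFile.toURI().toString());
            return source;
        }
    }

    /**
     * Placeholder conversion: drops anything that looks like a tag, escapes
     * the remaining text, and wraps it in a single root element. It does not
     * decode HTML entities or preserve document structure.
     */
    static String toXml(String html) {
        String text = html.replaceAll("<[^>]*>", "");
        String escaped = text.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        return "<html><body>" + escaped + "</body></html>";
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("sketch", ".html");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(),
            "<p>Hello <b>world</b></p>".getBytes(StandardCharsets.UTF_8));

        InputSource source = new HtmlSource(tmp).filterInput();
        System.out.println(source.getSystemId());
        System.out.println(toXml("<p>Hello <b>world</b></p>"));
    }
}
```

Returning an InputSource backed by a Reader keeps the converted XML in memory, so the caller can parse it without a temporary file. A production conversion would need a real HTML parser rather than the regular expression used here.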