Object
  HTMLToString
public class HTMLToString
This class provides a static convert() method that converts an HTML file into an XML string, which can then be pre-filtered and added to a Lucene index by the XMLTextProcessor class.
Internally, the HTML-to-XML conversion is performed by the JTidy library, a Java port of the HTML Tidy converter.
Field Summary | |
---|---|
private static HashMap | htmlCodeMap: A HashMap built from the htmlCodes table. |
(package private) static String[] | htmlCodes: Table of conversions from HTML ampersand codes to Unicode. |
(package private) static Tidy | tidy: The HTMLTidy object that does the conversion work. |
Constructor Summary | |
---|---|
HTMLToString() |
Method Summary | |
---|---|
static String | convert(InputStream htmlInputStream): Convert an HTML file into an HTMLTidy-style XML string. |
static String | replaceHtmlCodes(String in): Convert any non-XML ampersand codes within a string to their Unicode equivalents. |
Methods inherited from class Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
static Tidy tidy
static final String[] htmlCodes
private static HashMap htmlCodeMap
Constructor Detail |
---|
public HTMLToString()
Method Detail |
---|
public static String convert(InputStream htmlInputStream)
Parameters: htmlInputStream - Stream of HTML text to convert to an XML string.
null.
public static String replaceHtmlCodes(String in)
Parameters: in - The string within which to convert codes.
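The table-driven replacement described for htmlCodes, htmlCodeMap and replaceHtmlCodes() can be illustrated with a minimal sketch. It is not the class's actual implementation: the class name HtmlCodeSketch, the four-entry table, and the regular expression are assumptions made for illustration. It assumes the table alternates entity names and Unicode replacements, and that codes missing from the table (including the XML built-ins such as &amp;amp;) are left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the real HTMLToString class.
class HtmlCodeSketch {

    // Assumed layout: entity name followed by its Unicode replacement.
    // Only a few entries are shown; a real table would be much longer.
    static final String[] HTML_CODES = {
        "nbsp",   "\u00A0",
        "copy",   "\u00A9",
        "eacute", "\u00E9",
        "euro",   "\u20AC"
    };

    // Lookup map built once from the flat table above.
    static final Map<String, String> HTML_CODE_MAP = new HashMap<>();
    static {
        for (int i = 0; i < HTML_CODES.length; i += 2) {
            HTML_CODE_MAP.put(HTML_CODES[i], HTML_CODES[i + 1]);
        }
    }

    // Matches named ampersand codes such as &copy; (numeric codes are ignored here).
    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String replaceHtmlCodes(String in) {
        Matcher m = ENTITY.matcher(in);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = HTML_CODE_MAP.get(m.group(1));
            // Codes not in the table (e.g. &amp; or &lt;) are copied through unchanged.
            String text = (replacement != null) ? replacement : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceHtmlCodes("caf&eacute; &amp; bar &copy; 2005"));
    }
}
```

Keeping the XML built-in codes out of the table means the output remains well-formed XML, which matches the "non-XML ampersand codes" wording in the method summary.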