public class HTMLToString
extends Object
Provides a static convert() method that converts an HTML file into an XML string that can be pre-filtered and added to a Lucene database by the XMLTextProcessor class.

Modifier and Type | Field and Description |
---|---|
private static HashMap | htmlCodeMap: HashMap built from the htmlCodes table. |
(package private) static String[] | htmlCodes: Table of conversions from HTML ampersand codes to Unicode. |
(package private) static Tidy | tidy: The HTMLTidy object that does the conversion work. |
Constructor and Description |
---|
HTMLToString() |
Modifier and Type | Method and Description |
---|---|
static String | convert(InputStream htmlInputStream): Convert an HTML file into an HTMLTidy style XML string. |
static String | replaceHtmlCodes(String in): Convert any non-XML ampersand codes within a string to their Unicode equivalents. |
static Tidy tidy
static final String[] htmlCodes
private static HashMap htmlCodeMap
public static String convert(InputStream htmlInputStream)
Parameters:
htmlInputStream - Stream of HTML text to convert to an XML string.

public static String replaceHtmlCodes(String in)
Parameters:
in - The string within which to convert codes.
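
This page does not show the contents of htmlCodes or the body of replaceHtmlCodes, so the following is only a rough sketch of the technique the field and method descriptions suggest: a flat table of ampersand codes, a HashMap built from that table, and a replacement pass that leaves XML's own entities untouched. The class name HtmlCodeSketch, the table layout and its four entries, the regular expression, and the null handling are all assumptions made for illustration, not the library's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the documented class's implementation. */
final class HtmlCodeSketch {

    // Hypothetical flat table of (ampersand code, Unicode replacement) pairs.
    // The real htmlCodes table's layout and contents are not shown on this page.
    static final String[] HTML_CODES = {
        "&nbsp;",   "\u00A0",
        "&copy;",   "\u00A9",
        "&eacute;", "\u00E9",
        "&euro;",   "\u20AC",
    };

    // Lookup map built once from the table, mirroring the htmlCodeMap idea.
    static final Map<String, String> HTML_CODE_MAP = new HashMap<>();
    static {
        for (int i = 0; i + 1 < HTML_CODES.length; i += 2) {
            HTML_CODE_MAP.put(HTML_CODES[i], HTML_CODES[i + 1]);
        }
    }

    // Matches named ampersand codes such as &nbsp; or &amp;.
    private static final Pattern CODE = Pattern.compile("&[A-Za-z][A-Za-z0-9]*;");

    /** Replaces every code found in the table; all other text is copied unchanged. */
    static String replaceHtmlCodes(String in) {
        if (in == null) {
            return null; // assumption: pass null through rather than throw
        }
        Matcher m = CODE.matcher(in);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = HTML_CODE_MAP.get(m.group());
            if (replacement == null) {
                // Not in the table (for example XML's own &amp; or &lt;): keep as-is.
                replacement = m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceHtmlCodes("Caf&eacute; &amp; bar &copy; 2005"));
    }
}
```

Keeping the XML-predefined entities out of the table is what makes the output safe to embed in an XML string: &amp; and &lt; stay escaped, while HTML-only codes become literal Unicode characters.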