Version 2.1 Changes
While the last version focused on getting the documentation in order, this is a major new feature release with special focus on the default stylesheets and user interface.
Highlights
- Extensive UI improvements, including new search forms, built-in faceted browsing, and a crisp new look.
- Built-in XHTML and OAI/PMH output, NLM article in/out, and Microsoft Word in.
- For those starting fresh with XTF, the stylesheets are now easier to understand and adapt. For existing implementations, 2.1 is compatible with all prior XTF stylesheets and config files.
- Experimental "freeform" boolean query language.
- Many bug fixes and minor changes/features.
Features
- Revised default stylesheets:
- Faceted browsing and query drill-down
- Improved look and feel
- Simple and Advanced search
- Browse by Author or Title
- Improved bookbag
- Support for searching and navigating NLM-formatted XML articles
- Support for searching most Microsoft Word documents [Feature 1729731]
- docSelector.xsl now uses the ReadXMLStub extension (see below) to decide how to index XML documents, rather than relying on file or directory names.
- Now uses the web-standard YUI library for AJAX instead of custom JavaScript code.
- Basic OAI/PMH support now included.
- Support for efficiently handling robot crawling (e.g. Googlebot)
- Stylesheet code has been streamlined.
- New extension functions to:
- Send email from a stylesheet.
- Quickly read in the first part of an XML file.
- Check if cookies are enabled.
- Read and "tidy" HTML documents into XHTML documents (helpful for screen scraping and similar activities)
- A new query type is supported: <allDocs>. This is mainly for convenience in browsing the entire collection (usually used with facets)
- Ability to index most Microsoft Word documents [Feature 1729731]
- New "field:emptyFirst" and "field:emptyLast" doc sort modifiers. emptyLast is now the default, replacing the previous inscrutable handling of empty values.
- Also support "field:ascending" and "field:descending" sort modifiers, as a more verbose form of "+field" and "-field".
- New facet sorting option: reverseValue (useful for date field).
- New facet selection operator: singleton (useful for auto-expanding single selections in a hierarchical facet).
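The sort modifiers listed above can be illustrated with a small sketch. This is not XTF's implementation (XTF is written in Java); it is a Python illustration of the documented behaviour only. The function names `parse_sort_spec` and `sort_docs`, the dictionary-based documents, and the ability to chain a "+"/"-" prefix with a ":" modifier are all assumptions made for the example.

```python
def parse_sort_spec(spec):
    """Split a sort spec into (field, descending, empty_first).

    Handles the forms named in the change log: "+field", "-field",
    "field:ascending", "field:descending", "field:emptyFirst" and
    "field:emptyLast". Combining a prefix with a modifier is an
    assumption of this sketch, not documented XTF behaviour.
    """
    descending = False
    empty_first = False  # emptyLast is the default
    if spec.startswith("+"):
        spec = spec[1:]
    elif spec.startswith("-"):
        descending = True
        spec = spec[1:]
    field, *modifiers = spec.split(":")
    for mod in modifiers:
        if mod == "ascending":
            descending = False
        elif mod == "descending":
            descending = True
        elif mod == "emptyFirst":
            empty_first = True
        elif mod == "emptyLast":
            empty_first = False
        else:
            raise ValueError(f"unknown sort modifier: {mod}")
    return field, descending, empty_first


def sort_docs(docs, spec):
    """Sort a list of dict-based documents according to a sort spec."""
    field, descending, empty_first = parse_sort_spec(spec)
    # Documents with a missing or empty value are kept apart so that
    # their position does not depend on the sort direction.
    filled = [d for d in docs if d.get(field)]
    empty = [d for d in docs if not d.get(field)]
    filled.sort(key=lambda d: d[field], reverse=descending)
    return empty + filled if empty_first else filled + empty
```

With three sample documents, one of which has no "year" value, `sort_docs(docs, "year")` places the document without a year at the end, while `sort_docs(docs, "year:emptyFirst")` places it at the start.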
Changes
- Expanded set of sample documents.
- Session tracking is now enabled by default. Cookie-less mode is deprecated.
- Improved error reporting includes file name and line number for all XSLT errors.
- Changed to avoid loading DTD associated with documents. This can greatly boost speed, and also makes the system more reliable as it no longer depends on external web servers.
- Upgraded to new version of PDFBox which is faster and less buggy.
- There was previously no way to select facet values containing parentheses, brackets, asterisks, and other special characters. You can now use quotes to disambiguate these in a facet select expression. [Bug 1891510]
- You can now explicitly specify 'score' (or its synonym 'relevance') in sortDocsBy on meta-data fields. Score sorting has always been the default, but this lets you state it explicitly. [Bug 1307014]
Bug fixes
- Fixed occasional NullPointerException in crossQuery when 'raw' mode was used.
- Fixed other exception handling problems. Thanks to Jakob Saternus for finding these and submitting patches.
- Fixed rare NullPointerException when trying to scan attributes of a lazy document.
- Fixed bug in facet sorting: XTF would sometimes drop a single subgroup and all its descendants.
- Fixed HTML text extraction to remove or map illegal characters.
- Fixed bug in path processing: XTF would often erroneously remove "../.." from a path name.
- Fixed bug that caused an exception if non-XTF attributes were placed on an untokenized meta-data element. Thanks to Richard Padley for reporting this.
- Fixed potential out-of-memory situation if multiple threads simultaneously load facet data. [Bug 1503250]
- Fixed NullPointerException when using wildcards in a <near>...<not> query. Thanks to Seth Cherney for reporting this. [Bug 1891455]
- XTF would fail mysteriously if certain names were used for meta-data fields, such as "text" and "key". A descriptive error is now reported instead. [Bug 1291495]
- In high-volume situations when a new or changed index was loaded, it was possible for multiple threads to load the same facet data at the same time, wasting time and possibly causing an out-of-memory situation. These are now synchronized to be in strict sequence instead of parallel. [Bug 1503250]
- A stack trace was being reported for TermLimit exceptions, but isn't needed. [Bug 1717488]
- Fixed bug: sometimes Saxon on Windows would include "%20" in a system-id() path instead of a space, which caused FileUtils.exists() to fail. FileUtils now handles this case. [Bug 1814140]
- Fixed handling of multiple values for the same parameter name in the URL. XTF servlets now pass these correctly to stylesheets (which can deal correctly with them, or not, as they wish.) [Bug 1894034]
- Fixed assertion failure when a single stop word is specified as a keyword query. [Bug 1988961]
- Fixed bug: single-term, single-field keyword queries were failing mysteriously. [Bug 1903441]
- Fixed handling of drive-letter paths on Windows under Java 6 that caused the textIndexer to crash on start-up. [Bug 1936979]
- Corrected handling of <xsl:result-document> in servlets, so that it can change the output format during the transformation. Needed for XHTML/Frameset output. [Bug 1934874]
- Took out dynaXML caching of docSelector output. This cache didn't actually enhance performance, and caused tricky multi-threading bugs to occur.
- Removed obsolete DirectSearch and PreviewXML servlets (these were never used in XTF)
- Fixed crossQuery problems relating to query terms containing ampersands. Thanks to Richard Padley for submitting these. [Bugs 1951931 and 1951933]
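The facet-loading fix for Bug 1503250 describes a common pattern: serialize loads behind a lock so that concurrent requests for the same data trigger only one load. The following is a minimal Python sketch of that pattern, not XTF's actual Java code. `FacetCache`, `slow_loader` and `demo` are hypothetical names introduced for illustration.

```python
import threading
import time


class FacetCache:
    """Cache that loads each field's facet data at most once."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: field name -> facet data
        self._data = {}
        self._lock = threading.Lock()

    def get(self, field):
        # A single lock forces loads to run in strict sequence. A thread
        # that arrives while a load is in progress waits, then finds the
        # data already cached instead of loading a second copy.
        with self._lock:
            if field not in self._data:
                self._data[field] = self._loader(field)
            return self._data[field]


def demo(thread_count=8):
    """Request the same facet from several threads; return the load count."""
    calls = []

    def slow_loader(field):
        calls.append(field)
        time.sleep(0.01)  # stand-in for an expensive load
        return {"field": field, "values": ["a", "b"]}

    cache = FacetCache(slow_loader)
    threads = [
        threading.Thread(target=cache.get, args=("subject",))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls)
```

Calling `demo()` starts eight threads that all ask for the same facet, yet the loader runs only once. The trade-off of one global lock is that loads of different fields also wait on each other, which matches the "strict sequence instead of parallel" wording in the fix description.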
Experimental additions
- "Freeform" query language which allows users to type in fielded boolean queries similar to those supported by Google's advanced search.
- RawQuery servlet, which receives a single XML query in the URL, runs it, and returns XML results.
- Extension to read a PNG file, extract a piece of it, add yellow highlights, and send the image over HTTP. This may eventually be used to implement image page flipping with hits in context (e.g. PDFs, scanned texts, etc.)
- Support for alternate URL parameter tokenizers in crossQuery and dynaXML.