<directory dirPath="DirectoryPath"> File File … File </directory>where
dirPath="DirectoryPath" | is the absolute file path to the directory on disk. |
File, File... | is zero or more File Tags (see below), one for each file found in the directory. |
<file fileName="FileName"/>where
fileName="FileName" | is the name of a file found in the directory identified by the containing <directory...> tag. Note that this file name does not contain any path information for the file, but only the file name itself. |
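As an illustration of how the two tags above nest, the input for a directory holding two files might look like the following sketch. The directory path and file names are made-up placeholders, not values taken from any real installation.

```xml
<!-- Hypothetical input for one directory; the path and file names are placeholders -->
<directory dirPath="/xtf/data/collection1">
  <!-- One file tag per file found; names carry no path information -->
  <file fileName="book1.xml"/>
  <file fileName="book2.pdf"/>
</directory>
```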
<indexFiles> FileToIndex FileToIndex … FileToIndex </indexFiles>where
FileToIndex, FileToIndex... | is zero or more File To Index Tags (see below), one for each file to index in the directory. |
<file fileName = "FileName" {format = "FileFormat"} {preFilter = "PreFilterPath(s)"} {displayStyle = "DocumentFormatterPath"}/>where
fileName="FileName" | is a required attribute that specifies the name of a file to be indexed. Note that this file name should not contain any path information for the file, but only the file name itself. |
format="FileFormat" | is an optional attribute that specifies the format of a file to be indexed. Currently XML, PDF, HTML, plain text, and most Microsoft Word files are handled by the textIndexer, and the format attribute should correspondingly be set to XML, PDF, HTML, Text, or MSWord. If this attribute is omitted, the textIndexer will try to infer the file type based on the file extension. |
preFilter="PreFilterPath(s)" | is an optional attribute that specifies the path to the Pre-Filter stylesheet to be applied to this file. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.) Multiple pre-filters may be specified in a list; they should be separated by ";" or "," characters. The pre-filters will be applied in the order listed (e.g. the original file is sent to the first pre-filter; its output is sent to the second pre-filter, whose output is sent to the third, etc.) If this attribute is omitted, no pre-filter will be applied to the file. |
displayStyle="DocumentFormatterPath" | is an optional attribute that specifies the path to the Document Formatter stylesheet to use for this file. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.) If this attribute is present, the textIndexer will create a special cache that is used by the dynaXML servlet to display the current file more quickly. If this attribute is omitted, the cache is not created. For more details, see the discussion of Lazy Document Handling in the XTF Under the Hood guide. |
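A possible output using the attributes described above is sketched here. The file names and both stylesheet paths are hypothetical placeholders; relative paths would be resolved against XTF_HOME as noted above.

```xml
<!-- Hypothetical output; file names and stylesheet paths are placeholders -->
<indexFiles>
  <!-- format omitted: the textIndexer infers the type from the extension -->
  <file fileName="book1.xml"
        preFilter="style/textIndexer/myPreFilter.xsl"
        displayStyle="style/dynaXML/myDocFormatter.xsl"/>
  <!-- format given explicitly; no pre-filter, no display cache -->
  <file fileName="book2.pdf" format="PDF"/>
</indexFiles>
```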
<xsl:attribute name="xtf:index" select="'TrueOrFalse'"/> <xsl:attribute name="xtf:noindex" select="'TrueOrFalse'"/>This attribute is used to turn on/off indexing for a tag in a source document. The noindex variant is simply a logical inverse of the index variant. Both are provided as a convenience to the programmer. The value for either of these tags should be set to either the string 'true' or the string 'false'. (Note: If not explicitly set, nested sub-tags for a document inherit the index/noindex state from the closest parent tag for which an index state is defined.) This attribute can be used for normal text blocks, and also on blocks marked as metadata using the xtf:meta attribute below. In both cases, it controls whether the given block of text, or meta-data field, is added to the index. In the case of meta-data, a field that isn't added to the index will still be made available to the Result Formatter stylesheet when crossQuery results are displayed.
<xsl:attribute name="xtf:meta" select="'TrueOrFalse'"/>This attribute is used to mark the contents of a tag as being part of the meta-data for a document rather than main-body text for the document. The select value for this tag should be set to either the string 'true' (text in tag is meta data) or the string 'false' (text in tag is not meta data.) The entire tag and its contents will be treated as meta-data and will be added to the index using the element name as the name of the meta-data field. That is, the tag will be indexed separately from the full text of the document. Other attributes of the tag, and any embedded element tags, will be stored in the index and will be passed verbatim to the Result Formatter stylesheet to be used for output purposes. Of course the text of the element and any sub-elements will be searchable, but the actual attributes and element tags themselves cannot be searched for. Note: If you mark a section of text with the xtf:meta attribute, it will not be included in the full text index of that document (accessed by querying the text field). If you want a given piece of text to appear in both the meta-data and full-text indexes, make two copies of it, marking one with xtf:meta and not marking the other.
<xsl:attribute name="xtf:store" select="'TrueOrFalse'"/>This attribute is used to turn on/off whether to store the contents of a meta-data field in the index, and make them available to the Result Formatter stylesheet. The value for this attribute should be set to either the string 'true' or the string 'false'. If not specified, this attribute defaults to 'true'. This attribute can only be used on meta-data blocks that also have the xtf:meta attribute set. Setting xtf:store to 'false' can make the final index smaller, and can also speed up processing by the Result Formatter stylesheet, since it will have less data to process. A field can be indexed and stored, indexed and not stored, or stored and not indexed; all of these combinations can be useful in certain circumstances.
<xsl:attribute name="xtf:tokenize" select="'YesOrNo'"/>This attribute is used to indicate whether a meta-data field should be tokenized or not. By default, meta-data fields are tokenized so they can be searched. If you intend to use a meta-data field for sorting query results instead, set this attribute to 'no'.
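The meta-data attributes above might be combined in a pre-filter along the following lines. This is only a sketch: the element names title, sort-title, and abstract are invented for illustration, and the namespace URI bound to the xtf prefix is an assumption that should be checked against your own stylesheets.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xtf="http://cdlib.org/xtf"> <!-- assumed URI for the xtf prefix -->

  <!-- Copy everything that is not handled by a more specific template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical <title>: a searchable, stored meta-data field -->
  <xsl:template match="title">
    <xsl:copy>
      <xsl:attribute name="xtf:meta" select="'true'"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical <sort-title>: left untokenized so it can be used for sorting -->
  <xsl:template match="sort-title">
    <xsl:copy>
      <xsl:attribute name="xtf:meta" select="'true'"/>
      <xsl:attribute name="xtf:tokenize" select="'no'"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical <abstract>: stored for display but not added to the index -->
  <xsl:template match="abstract">
    <xsl:copy>
      <xsl:attribute name="xtf:meta" select="'true'"/>
      <xsl:attribute name="xtf:index" select="'false'"/>
      <xsl:attribute name="xtf:store" select="'true'"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Each xsl:attribute instruction is placed before the child content is copied, because XSLT requires attributes to be written before any children of the element.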
<xsl:attribute name="xtf:proximitybreak" select="'TrueOrFalse'"/>This attribute introduces a proximity break into a document. A tag marked with a proximity break attribute is considered to be infinitely far away from the previous or containing tag. Using this tag prevents proximity matches that span two adjacent tags from being counted as a valid match. The select value for this tag should be set to either the string 'true' (introduce a proximity break) or the string 'false' (do not introduce a proximity break.) To de-emphasize rather than disallow proximity matches across sections, use the sectionBump attribute instead (see below).
<xsl:attribute name="xtf:sentenceBump" select="BumpInWords"/>This attribute de-emphasizes proximity searches that span multiple sentences by introducing extra virtual spacing between adjacent sentences. The amount of virtual spacing to add between the end of the previous sentence and the beginning of the current one is specified as a number of virtual words by the BumpInWords argument. This value, if not specified, defaults to five words of added spacing. (Note: If not explicitly set, nested sub-tags for a document inherit the sentence bump value from the closest parent tag for which a sentence bump value is defined.)
Section Type Attribute
<xsl:attribute name="xtf:sectionType" select="'TypeName'"/>This attribute assigns a section type name to a tag, with the TypeName parameter identifying the section name to use. Assigning a section name to a tag allows grouped searches to be performed on tags that have the same section names, by inserting a Section Type Tag into a query. (Note: If not explicitly set, nested sub-tags for a document inherit the section type name from the closest parent tag for which a section name is defined.)
<xsl:attribute name="xtf:sectionTypeAdd" select="'TypeName'"/>This attribute appends a section type name to the section type already associated with a tag (or one of its ancestors which has a section type), with the TypeName parameter identifying the section name to append. Assigning a section name to a tag allows grouped searches to be performed on tags that have the same section names, by inserting a Section Type Tag into a query. And appending a section type allows child tags to inherit their parent's sectionType and then add additional type information. This can be very useful for representing hierarchical information using section types. (Note: If not explicitly set, nested sub-tags for a document inherit the section type name from the closest parent tag for which a section name is defined, including any section type which has been appended to that parent tag.)
<xsl:attribute name="xtf:sectionBump" select="BumpInWords"/>This attribute de-emphasizes proximity searches that span multiple sections by introducing extra virtual spacing between adjacent sections. The amount of virtual spacing added between the end of the previous section and the beginning of the current one is specified as a number of virtual words by the BumpInWords argument. This value, if not specified, defaults to zero words of added spacing.
<xsl:attribute name="xtf:wordBoost" select="BoostValue"/>This attribute boosts or de-emphasizes the relevance of text found within a particular tag. To boost the relevance of text in a tag, set the BoostValue parameter to a floating-point number greater than 1.0. To de-emphasize the relevance of a tag's text, set the BoostValue parameter to a floating-point number between 0.0 and 1.0. (Note: If not explicitly set, nested sub-tags for a document inherit the boost value from the closest parent tag for which a boost value is defined.)
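The section and boost attributes could be applied together roughly as follows. The element names chapter and head, the type names, and the numeric values are illustrative assumptions, as is the namespace URI bound to the xtf prefix.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xtf="http://cdlib.org/xtf"> <!-- assumed URI for the xtf prefix -->

  <!-- Copy everything that is not handled by a more specific template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical <chapter>: name the section and add 10 virtual words
       of spacing between it and the previous section -->
  <xsl:template match="chapter">
    <xsl:copy>
      <xsl:attribute name="xtf:sectionType" select="'chapter'"/>
      <xsl:attribute name="xtf:sectionBump" select="'10'"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical chapter heading: append to the inherited section type
       and boost the relevance of its words -->
  <xsl:template match="chapter/head">
    <xsl:copy>
      <xsl:attribute name="xtf:sectionTypeAdd" select="'head'"/>
      <xsl:attribute name="xtf:wordBoost" select="'2.0'"/>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```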
<xsl:param name="http.URL"/>This parameter contains a URL string of the form: http://yourserver:yourport/servlet?queryparms where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
servlet | is the name of the servlet to which the request is being sent. Normally, this is either search (for crossQuery) or view (for dynaXML) |
queryparms | is the list of parameters that defines the actual request being sent to the servlet. All URL escape codes (such as %20 for space) will have been translated to normal characters, and UTF-8 octet sequences will have been decoded. |
This field identifies the full URL passed to the XTF system for the current request, with all of the percent codes left unescaped. The request URL is always available to servlets (unlike many of the following parameters, which are optional). It is accessed via the XSL parameter:
<xsl:param name="http.rawURL"/>This parameter contains a URL string of the form: http://yourserver:yourport/servlet?queryparms where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
servlet | is the name of the servlet to which the request is being sent. Normally, this is either search (for crossQuery) or view (for dynaXML) |
queryparms | is the list of parameters that defines the actual request being sent to the servlet. All URL escape codes (such as %20 for space) will be left unescaped (i.e. not translated to normal characters) and UTF-8 octet sequences will not have been decoded. |
<xsl:param name="servlet.dir"/>Typically this value comes from the servlet container (e.g. Resin or Tomcat), but may be overridden by specifying a base-dir parameter in the servlet container configuration. As this varies by container, check the documentation for your servlet container if you wish to override this value.
<xsl:param name="servlet.URL"/>This parameter contains a URL string of the form: http://yourserver{:yourport}/xtf/servlet where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
servlet | is the name of the servlet to which the request is being sent. Normally, this is either search (for crossQuery) or view (for dynaXML) |
<xsl:param name="root.URL"/>This parameter contains a URL string of the form: http://yourserver{:yourport}/xtf/ where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
<xsl:param name="http.user-agent"/>This parameter contains a string that identifies the browser that made the current request. Most HTTP requests will provide this field. For example:
Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; empas)Note that the contents of this field vary widely depending on which browser made the request, and a detailed description is beyond the scope of this document.
<xsl:param name="http.referer"/>This parameter holds a URL string identifying the web page that issued the request. Often the initial request to the servlet will supply this parameter; subsequent requests will not.
<xsl:param name="http.if-modified-since"/>Most HTTP requests will not include this parameter, but if they do, it contains a time string of the form: weekday, dd-mmm-yy hh:mm:ss timezone where
weekday | is the day of the week the request was last issued |
dd-mmm-yy | is the day, three letter month abbreviation, and year the request was last issued |
hh:mm:ss | is the time the request was last issued, represented as a 24 hour GMT based time |
timezone | is the offset in hours of the timezone from which the request was last issued. |
<xsl:param name="http.field-name"/>where field-name is the name of the HTTP header field. Note that most HTTP header fields will be absent most of the time.
<xsl:param name="ElementName.AttributeName"/>For a concrete example, see the Programming Guide.
<parameters> ParameterBlock ParameterBlock … </parameters>
<param name="ParamName" value="ParamValue"> Token | Phrase Token | Phrase … </param>where
name="ParamName" | is the name of the parameter extracted from the original query URL. |
value="ParamValue" | is the original text in the query URL that is assigned to the specified parameter. |
<xsl:param name="ParamName" select="DefaultValueIfNotInURL"/>This allows query parameters to be accessed either through the standard template driven XML or through stylesheet parameters.
<token value="Word" isWord="YesOrNo"/>where
value="Word" | is the actual word or symbol extracted from the URL. |
isWord="YesOrNo" | identifies whether the token is a word (isWord="yes") or a punctuation symbol (isWord="no".) |
<phrase value="StringOfWords"> Token Token … </phrase>where
value="StringOfWords" | is the entire phrase extracted from the URL as a single string. |
Token, Token... | is the original phrase broken down into one or more Token Tags (see above), one for each word or symbol in the phrase. |
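To show how the parameter, token, and phrase tags fit together, here is a sketch of what the block might look like for a hypothetical query string. The parameter names and words are invented, and the exact attribute values (in particular whether quotation marks are retained in the param value) are assumptions rather than documented behavior.

```xml
<!-- Sketch for a hypothetical query string: text=moby dick&title="white whale" -->
<parameters>
  <!-- Unquoted words: one token per word -->
  <param name="text" value="moby dick">
    <token value="moby" isWord="yes"/>
    <token value="dick" isWord="yes"/>
  </param>
  <!-- Assumption: quoted words arrive as a single phrase; whether the
       quotation marks are kept in the param value is not stated above -->
  <param name="title" value="&quot;white whale&quot;">
    <phrase value="white whale">
      <token value="white" isWord="yes"/>
      <token value="whale" isWord="yes"/>
    </phrase>
  </param>
</parameters>
```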
$exception | A string containing the name of the exception that occurred. This will be the name of one of the error/exception tags listed below (e.g., ExcessiveWork, TermLimit, etc.) |
$message | The descriptive message for the error/exception (if any; may be an empty string.) |
$stackTrace | The HTML-formatted Java stack trace generated by the exception (if any; may be an empty string.) |
<QueryFormat> <message>Error Message</message> </QueryFormat>To generate such an error, the Query Parser simply issues a
<error message="Error Message"/>tag instead of a query tag. The error message specified by the parser's error tag is then transferred to the Query Format Error Tag, for processing by the servlet's Error Generator stylesheet.
<TermLimit> <message>Error Message</message> </TermLimit>This error is most often generated by the expansion of wildcard and range queries.
<ExcessiveWork> <message>Error Message</message> </ExcessiveWork>
<InvalidDocument> <message>Error Message</message> <docId>Document Identifier</docId> </InvalidDocument>Usually, this error is generated when a request is made for a document that doesn't exist (for instance, it might have been removed, or the document ID might be invalid.)
<NoPermission> <message>Error Message</message> <ipAddr>IP Address</ipAddr> </NoPermission>Note that the IP address will only be included for requests that failed during IP-list authentication (and not for LDAP or external authentication, for example.)
<UnsupportedQuery> <message>Error Message</message> </UnsupportedQuery>Usually, this error is generated when a query tag has been used in a document request without the required index tag.
<GeneralExceptionName> <message>Error Message</message> <stackTrace>HTML-Formatted Java Stack Trace</stackTrace> </GeneralExceptionName>Exceptions are usually generated by anomalous fatal conditions like missing required files, corrupted indexes, files locked by other applications, or bugs in the XTF code itself.
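An Error Generator stylesheet could branch on these values roughly as sketched below. The sketch assumes that the error tag arrives as the root of the input document and that the three parameters are passed under the names listed above; the HTML wording is purely illustrative.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Parameters described above; assumed to be supplied by the servlet -->
  <xsl:param name="exception"/>
  <xsl:param name="message"/>
  <xsl:param name="stackTrace"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:choose>
          <!-- Wildcard or range expansion produced too many terms -->
          <xsl:when test="$exception = 'TermLimit'">
            <p>Your query matched too many terms. Please narrow it.</p>
          </xsl:when>
          <!-- Assumes the InvalidDocument tag is the input root element -->
          <xsl:when test="$exception = 'InvalidDocument'">
            <p>Document not found: <xsl:value-of select="/InvalidDocument/docId"/></p>
          </xsl:when>
          <!-- Anything else: show the exception name and its message -->
          <xsl:otherwise>
            <p><xsl:value-of select="$exception"/>: <xsl:value-of select="$message"/></p>
          </xsl:otherwise>
        </xsl:choose>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```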
YesOrNo | specifies whether or not raw XML search results are sent to your browser. If set to yes, the formatter stylesheet is disabled for the query, and raw XML search results or marked up document contents are sent to your browser. If set to no, the XML is sent to the formatter stylesheet for processing, and its output in turn is sent to the browser. If this URL parameter is not specified, it defaults to no. |
<xsl:value-of select="session:setData(Name, Value)"/>where
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
Name | specifies the name under which to store the value in the session data. |
Value | specifies the particular value to store. It may be either a string, or a structured piece of XML with a single outer-level element (and any number of inner elements.) |
<xsl:variable name="Variable" select="session:getData(Name)"/>where
Variable | specifies the name of an XSLT variable to create. |
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
Name | specifies the name to look up in the session data. |
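The two session functions might be used as a pair like this. The template names, the key 'lastQuery', and the queryText parameter are hypothetical, and what getData returns for a name that was never stored is not described above, so the test below is only a guess at a safe check.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:session="java:org.cdlib.xtf.xslt.Session"
    exclude-result-prefixes="session">

  <!-- Hypothetical helper: remember the user's most recent query text -->
  <xsl:template name="rememberQuery">
    <xsl:param name="queryText"/>
    <xsl:value-of select="session:setData('lastQuery', $queryText)"/>
  </xsl:template>

  <!-- Hypothetical helper: show the remembered query, if there is one -->
  <xsl:template name="showLastQuery">
    <xsl:variable name="last" select="session:getData('lastQuery')"/>
    <!-- Assumption: a missing value tests as false (empty or absent) -->
    <xsl:if test="string($last) != ''">
      <p>Previous search: <xsl:value-of select="$last"/></p>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```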
<xsl:variable name="sessionID" select="session:getID()"/>where
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
<xsl:if test="session:isEnabled()"/>where
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
<xsl:variable name="Variable" select="session:encodeURL(RawURL)"/>where
Variable | specifies the name of an XSLT variable to create. This variable will contain the new, encoded, URL. |
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
RawURL | specifies an expression or other variable containing the URL to be encoded. |
<xsl:if test="session:noCookie()"> <a href="javascript:alert('Cookies are disabled!')"> Requires Cookie! </a> </xsl:if>where
session: | is a namespace prefix to differentiate this function from any other. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.Session. |
External Command Extension Element
This element can be used inside any XTF stylesheet to call a command-line program, like this:<exec:run command = "CommandName" {timeout = "Milliseconds"} xsl:extension-element-prefixes="exec"> {<exec:arg>Argument1</exec:arg>} {<exec:arg>Argument2</exec:arg>} … {<exec:input> XmlOrString </exec:input>} </exec:run>where
exec: | is a namespace prefix identifying this particular Saxon extension. The namespace URI for this prefix must be: java:/org.cdlib.xtf.saxonExt.Exec. |
command="CommandName" | specifies the command-line program to run. In general, this should be an absolute path; if a relative path is given, it will be resolved in an undefined manner. |
timeout="Milliseconds" | is an optional attribute setting an upper limit, in milliseconds, on the amount of time the external process will be given to finish its work. If this time is exceeded, the process will be forcibly terminated, and a Java exception will be thrown (which terminates stylesheet processing immediately.) If this attribute is not specified, the process will be allowed to run to completion no matter how long it takes. |
<exec:arg> | is an optional sub-element that can be used repeatedly to specify command-line arguments to be passed to the program. Each argument should be specified in its own <exec:arg> element; in particular, pairs should usually be broken up. For example, to perform the command "ls -al *" one would specify two arguments, the first being "-al" and the second being "*". |
<exec:input> | is an optional sub-element that specifies the input to be sent to the external program (input will be sent to the process's standard input stream). If the content of the exec:input element is XML data (for instance, a variable holding one or more elements) then the servlet will automatically serialize the data into standard XML format, with UTF-8 character encoding. If the content is not XML, the string will be sent to the tool verbatim. |
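A call following the syntax above might look like the sketch below. The command path, directory argument, timeout value, and template name are illustrative assumptions; what the element contributes to the stylesheet output is not described above, so the sketch makes no claim about it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exec="java:/org.cdlib.xtf.saxonExt.Exec">

  <!-- Hypothetical template: run "ls -al /tmp" with a 5 second limit -->
  <xsl:template name="listTmp">
    <exec:run command="/bin/ls" timeout="5000"
              xsl:extension-element-prefixes="exec">
      <!-- Each argument goes in its own exec:arg element -->
      <exec:arg>-al</exec:arg>
      <exec:arg>/tmp</exec:arg>
    </exec:run>
  </xsl:template>

</xsl:stylesheet>
```

An absolute command path is used here because, as noted above, relative paths are resolved in an undefined manner.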
<xsl:if test="FileUtils:exists(FilePath)"> … </xsl:if>where
FileUtils: | is a namespace prefix identifying this set of Saxon extension functions. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.FileUtils. |
FilePath | specifies the relative or absolute path to the file in question. If a relative path is specified, it will be resolved in relation to the stylesheet that calls the function. |
<xsl:variable name="myFileLen" select="FileUtils:length(FilePath)"/>where
FileUtils: | is a namespace prefix identifying this set of Saxon extension functions. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.FileUtils. |
FilePath | specifies the relative or absolute path to the file in question. If a relative path is specified, it will be resolved in relation to the stylesheet that calls the function. |
<xsl:variable name="myModTime" select="FileUtils:lastModified(FilePath, DateFormat)"/>where
FileUtils: | is a namespace prefix identifying this set of Saxon extension functions. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.FileUtils. |
FilePath | specifies the relative or absolute path to the file in question. If a relative path is specified, it will be resolved in relation to the stylesheet that calls the function. |
DateFormat | is a string representing the format that the date and/or time should be returned in. This uses codes from Java's SimpleDateFormat class such as yyyy-MM-dd:HH:mm:ss. Warning: "mm" and "MM" are different: the former is minutes, the latter is months. |
<xsl:variable name="myTime" select="FileUtils:curDateTime(DateFormat)"/>where
FileUtils: | is a namespace prefix identifying this set of Saxon extension functions. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.FileUtils. |
DateFormat | is a string representing the format that the date and/or time should be returned in. This uses codes from Java's SimpleDateFormat class such as yyyy-MM-dd:HH:mm:ss. Warning: "mm" and "MM" are different: the former is minutes, the latter is months. |
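The functions described so far could be combined as in this sketch. The default file path, the template name, and the info element are hypothetical; the date patterns are ordinary SimpleDateFormat codes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:FileUtils="java:org.cdlib.xtf.xslt.FileUtils"
    exclude-result-prefixes="FileUtils">

  <!-- Hypothetical default; a relative path is resolved against this stylesheet -->
  <xsl:param name="path" select="'data/book1.xml'"/>

  <!-- Hypothetical template: report size and timestamps for an existing file -->
  <xsl:template name="fileInfo">
    <xsl:if test="FileUtils:exists($path)">
      <!-- MM is months and mm is minutes, as warned above -->
      <info size="{FileUtils:length($path)}"
            modified="{FileUtils:lastModified($path, 'yyyy-MM-dd HH:mm:ss')}"
            checked="{FileUtils:curDateTime('yyyy-MM-dd')}"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```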
<xsl:variable name="xmlStub" select="FileUtils:readXMLStub(FilePath)"/>where
FileUtils: | is a namespace prefix identifying this set of Saxon extension functions. The namespace URI for this prefix must be: java:org.cdlib.xtf.xslt.FileUtils. |
FilePath | specifies the relative or absolute path to the file in question. If a relative path is specified, it will be resolved in relation to the stylesheet that calls the function. |
<xsl:for-each select="FileUtils:readXMLStub($file)"> <xsl:variable name="rn" select="name(*[1])"/> <xsl:variable name="pid" select="unparsed-entity-public-id($rn)"/> <xsl:variable name="uri" select="unparsed-entity-uri($rn)"/> <xsl:variable name="ns" select="namespace-uri(*[1])"/> <xsl:if test="matches($rn,'^TEI') or matches($pid,'TEI') or matches($uri,'tei2\.dtd') or matches($ns,'tei')"> This must be a TEI document!... </xsl:if> … </xsl:for-each> In the example stylesheets you can see it in action in both the docSelector and docReqParser.
<redirect:send url="TargetURL" xmlns:redirect="java:/org.cdlib.xtf.saxonExt.Redirect" xsl:extension-element-prefixes="redirect"/>where
url="TargetURL" | is a required attribute specifying the URL that the user's browser should be redirected to. An absolute or relative URL may be specified. If relative, the URL will be resolved by the servlet container to an absolute URL. |
redirect: | is a namespace prefix identifying this Saxon extension instruction. The namespace URI for this prefix must be: java:/org.cdlib.xtf.saxonExt.Redirect. In addition, it must be declared in the list of extension-element-prefixes for Saxon. The declarations are most easily done in-line as shown above. |
<parameters> ParameterBlock ParameterBlock … </parameters>
<param name="ParamName" value="ParamValue"> Token | Phrase Token | Phrase … </param>where
name="ParamName" | is the name of the parameter extracted from the original query URL. |
value="ParamValue" | is the original text in the query URL that is assigned to the specified parameter. |
<xsl:param name="ParamName" select="DefaultValueIfNotInURL"/>This allows query parameters to be accessed either through the standard template driven XML or through stylesheet parameters.
<token value="Word" isWord="YesOrNo"/>where
value="Word" | is the actual word or symbol extracted from the URL. |
isWord="YesOrNo" | identifies whether the token is a word (isWord="yes") or a punctuation symbol (isWord="no".) |
<phrase value="StringOfWords"> Token Token … </phrase>where
value="StringOfWords" | is the entire phrase extracted from the URL as a single string. |
Token, Token... | is the original phrase broken down into one or more Token Tags (see above), one for each word or symbol in the phrase. |
<xsl:param name="http.URL"/>This parameter contains a URL string of the form: http://yourserver:yourport/servlet?queryparms where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
servlet | is the name of the servlet to which the request is being sent. Normally, this is either search (for crossQuery) or view (for dynaXML) |
queryparms | is the list of parameters that defines the actual request being sent to the servlet. All URL escape codes (such as %20 for space) will have been translated to normal characters, and UTF-8 octet sequences will have been decoded. |
<xsl:param name="servlet.dir"/>Typically this value comes from the servlet container (e.g. Resin or Tomcat), but may be overridden by specifying a base-dir parameter in the servlet container configuration. As this varies by container, check the documentation for your servlet container if you wish to override this value.
<xsl:param name="servlet.URL"/>This parameter contains a URL string of the form: http://yourserver{:yourport}/xtf/servlet where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
servlet | is the name of the servlet to which the request is being sent. Normally, this is either search (for crossQuery) or view (for dynaXML) |
<xsl:param name="root.URL"/>This parameter contains a URL string of the form: http://yourserver{:yourport}/xtf/ where
yourserver | is the name of your XTF server |
yourport | is the port through which XTF requests are routed (typically 8080) |
<xsl:param name="http.user-agent"/>This parameter contains a string that identifies the browser that made the current request. Most HTTP requests will provide this field. For example:
Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt; empas)Note that the contents of this field vary widely depending on which browser made the request, and a detailed description is beyond the scope of this document.
<xsl:param name="http.referer"/>This parameter holds a URL string identifying the web page that issued the request. Often the initial request to the servlet will supply this parameter; subsequent requests will not.
<xsl:param name="http.if-modified-since"/>Most HTTP requests will not include this parameter, but if they do, it contains a time string of the form: weekday, dd-mmm-yy hh:mm:ss timezone where
weekday | is the day of the week the request was last issued |
dd-mmm-yy | is the day, three letter month abbreviation, and year the request was last issued |
hh:mm:ss | is the time the request was last issued, represented as a 24 hour GMT based time |
timezone | is the offset in hours of the timezone from which the request was last issued. |
<xsl:param name="http.field-name"/>where field-name is the name of the HTTP header field. Note that most HTTP header fields will be absent most of the time.
<xsl:param name="ElementName.AttributeName"/>For a concrete example, see the Programming Guide.
<route> QueryParserTag {ErrorGenTag} </route>The QueryParserTag (see below) specified within this tag identifies the Query Parser stylesheet to use. If specified, the ErrorGenTag (see below) identifies an Error Generator stylesheet to use instead of the default specified in the crossQuery.conf file.
<queryParser path="QueryParserLocation"/>This tag identifies which Query Parser stylesheet should be utilized by crossQuery to parse the URL parameters and produce an XTF query. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.)
<errorGen path="ErrorGeneratorLocation"/>This tag identifies which stylesheet should be used in case unexpected errors occur during query parsing or processing. This overrides the default error generator specified in the crossQuery.conf configuration file. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.)
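Put together, the output built from the three tags above might read as follows. Both stylesheet paths are hypothetical placeholders; being relative, they would be resolved against XTF_HOME.

```xml
<!-- Hypothetical routing result; both paths are placeholders -->
<route>
  <!-- Stylesheet that turns the URL parameters into an XTF query -->
  <queryParser path="style/crossQuery/myQueryParser.xsl"/>
  <!-- Optional: overrides the error generator named in crossQuery.conf -->
  <errorGen path="style/crossQuery/myErrorGen.xsl"/>
</route>
```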
<query> <error> <term> <phrase> <exact> <and> <or> <orNear> <not> <near> <range> <sectionType> <facet> <spellcheck> <resultData> <moreLike> <allDocs>
<query indexPath = "IndexDBLocation" style = "ResultFormatterLocation" {sortDocsBy = "ListOfMetaFields|score"} {startDoc = "FirstDocToReturn"} {maxDocs = "MaxDocsToReturn"} {termLimit = "MaxTermsToAllow"} {workLimit = "MaxWorkToAllow"} {maxContext = "MaxContextChars"} {maxSnippets = "SnippetsToOutput"} {termMode = "TermMarkMode"} {field = "FieldToSearch"} {normalizeScores = "TrueOrFalse"} {explainScores = "TrueOrFalse"}> QueryElement </query>where
indexPath="IndexDBLocation" | is the path to the Index Database to use when performing the search. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.) |
style="ResultFormatterLocation" | is the path to the Result Formatter stylesheet to use to display the results generated by the current query. If this path is not specified as an absolute path, it is assumed to be relative to the XTF base installation directory (i.e., XTF_HOME.) |
sortDocsBy="ListOfMetaFields|score" | is an optional attribute specifying a list of meta fields by which to sort the results. The list should consist of a quoted string containing one or more meta-field names, separated by commas. If multiple meta-fields are specified, the results are sorted first by the left-most meta-field, then sub-sorted by subsequent fields to produce the final output. Optionally, each meta-field name can be preceded by a plus sign (+) or a minus sign (-) to indicate whether the results for that field should be sorted in ascending or descending order. If no plus or minus sign is specified for a meta-field, then the results are sorted in ascending order by default. If this attribute is not specified, documents are by default sorted in order of decreasing score (so the most "relevant" documents are first.) With XTF 3.0, this default behavior can now also be explicitly set by providing a value of "score" (or its synonym "relevance"). (Note: Meta tags to be used for sorting queries should also have an xtf:tokenize="no" attribute set, or sorting will produce unpredictable results.) (Compatibility note: This attribute was previously called "sortMetaFields", and this old name is still accepted to retain backward compatibility.) |
startDoc="FirstDocToReturn" | is an optional attribute specifying the ordinal number of the first matching document to pass on to the Result Formatter. If not specified, this attribute defaults to 1, meaning the first document that contains matches for the specified query. |
maxDocs="MaxDocsToReturn" | is an optional attribute specifying the number of matching documents to pass on to the Result Formatter. If not specified, this attribute defaults to 10, meaning that up to 10 documents with matches will be returned for the specified query. The special value "all" may be used to indicate that all matching documents should be returned. |
termLimit="MaxTermsToAllow" | is an optional attribute that limits the number of terms permitted in a query. If not specified, this attribute defaults to 50. This attribute is used primarily to prevent wildcard expansions like <term>a*</term> from overloading the crossQuery servlet. If the query does in fact exceed the limit specified by this attribute, a TermLimit error is sent to the Error Generator stylesheet for the offending query. |
workLimit="MaxWorkToAllow" | is an optional attribute that limits the amount of "work" that may be performed in a query. If not specified, this attribute defaults to -1, meaning no limit is enforced. This attribute is used primarily to prevent queries from overloading the crossQuery servlet, which would adversely impact the responsiveness of the XTF system. If a query exceeds the work limit set by this attribute, an ExcessiveWork error is sent to the Error Generator stylesheet for the offending query. For the crossQuery servlet, one unit of "work" is equivalent to finding a single matching term in a single document. Experimentally, a value of 500,000 for this attribute seems to work well. |
maxContext="MaxContextChars" | identifies the size of a snippet to pass in the Result Formatter snippet tag. If not specified, this attribute defaults to 80 characters. Note that the context length is the total number of characters for the snippet, which includes both the matched text and the context text surrounding it. |
maxSnippets="SnippetsToOutput" | identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Note that -1 is a special value. If maxSnippets is set to -1, it requests that all snippets for a document be returned. If the maxSnippets attribute is set by the <query> tag, any occurrences of the maxSnippets attribute set by the inner tags must match the value set by the <query> tag. Otherwise, an error will be generated. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
termMode="TermMarkMode" | is an optional parameter that specifies how terms should be marked in the search results. Valid values for this option are: "none": Do not mark matching terms anywhere in the search results. "hits": Mark matching terms in the search results only if they appear inside <hit> tags. "context": Mark matching terms in the search results only if they appear inside <snippet> or <hit> tags. "all": Mark matching terms in search results anywhere they occur. If not specified, the default value used is "hits". Note that when term marking is enabled, matching terms in the search results are placed inside <term> ... </term> tag sets. |
field="FieldToSearch" | is an optional parameter that identifies which field in the index to search. This can be set to text to indicate that the main text of the document should be searched, or it can name a meta-data field such as creator or subject. If child elements specify their own field attributes, their field name must agree with the parent element. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field name is always used.) |
normalizeScores="TrueOrFalse" | is an optional parameter that can disable score normalization, the process that converts all document scores to be relative to the highest ranking document (which receives a score of 100). If set to no or false, the scores will be raw floating point numbers. The default, true or yes, will normalize scores to the range of 0..100, and round them to whole numbers. In the default XTF stylesheets, one can simply add ";normalizeScores=1" to the query URL, and the default Query Parser will set this attribute for you. |
explainScores="TrueOrFalse" | is an optional parameter that causes XTF to output a structured, detailed explanation of how the score for each document was calculated. This means that each Document Hit Tag in the query result will contain a Score Explanation Tag, which in turn contains other Score Explanation Tags describing the components of that score. In the default XTF stylesheets, one can simply add ";explainScores=1" to the query URL, and the default Query Parser will set this attribute for you. This is an advanced feature, as XTF's scoring is fairly complex and can be confusing to those just starting out. For an overview of how XTF scores document hits, see the Scoring section of the document XTF Under the Hood. Note that if you enable this attribute, you should generally disable normalizeScores above, as the score explanations describe the non-normalized score for each document. |
For example, to request up to 50 matching documents starting with the 51st: <query … startDoc="51" maxDocs="50"> … </query>
The QueryElement specified within this tag identifies the query to be performed. This can be a term query tag, or a phrase, exact, and, or, orNear, near, range, resultData, spellcheck, or not tag. Note that the Query Parser stylesheet can issue a single top-level error tag instead of a query tag if it encounters any errors.
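To make the attribute descriptions above more concrete, the following is a minimal sketch of a complete query tag as a Query Parser stylesheet might emit it. The index path, the stylesheet path, and the "year" and "title" meta-field names are assumptions chosen for illustration; substitute the values used by your own installation.

```xml
<!-- Hypothetical values: indexPath, style, and the meta-field names
     "year" and "title" are illustrative assumptions only. -->
<query indexPath="index"
       style="style/crossQuery/resultFormatter/resultFormatter.xsl"
       sortDocsBy="-year,title"
       startDoc="1"
       maxDocs="20"
       termLimit="100"
       workLimit="500000"
       maxContext="80"
       maxSnippets="3"
       termMode="hits">
  <!-- The QueryElement: here, a single term query on the main text. -->
  <term field="text">whale</term>
</query>
```

In this sketch the results are sorted by descending year and then by ascending title, and the relative paths are resolved against XTF_HOME as described above.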
<error message="Error Message"/>where Error Message is an error string describing the error. This tag is a top level tag, and should be issued by itself in place of the normal query tag when an error occurs. Once issued, the error will be routed to the Error Generator stylesheet for processing.
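As a small illustration, a Query Parser stylesheet that finds no usable search terms might emit the following in place of a query tag. The message text is an assumption made up for this example.

```xml
<!-- Issued by itself, instead of <query>; the message wording is illustrative. -->
<error message="No search terms were supplied."/>
```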
<term {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> WordToFind {OptionalSectionTypeQuery} </term>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that the field name specified by a <term> tag must match the field name set by any tags that contain it. Otherwise, an error will be generated. |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, the maxSnippets specified by a <term> tag must match the value set by any tags that contain it. Otherwise, an error will be generated. |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this term in the query. Boost values higher than 1.0 increase the relevance of a term, while boost values between 0.0 and 1.0 decrease the relevance of a term. Boost values less than zero will generate an error. |
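A term tag using all three attributes might look like the sketch below. The word and the boost value are arbitrary choices for illustration.

```xml
<!-- Search the main document text for "whale", return up to 3 snippets
     per document, and double this term's relevance contribution. -->
<term field="text" maxSnippets="3" boost="2.0">whale</term>
```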
<phrase {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </phrase>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that the field name specified by a <phrase> tag must match the field name set by any tags that contain it. Otherwise, an error will be generated. If no parent tags specify a field name, any field name can be used by the <phrase> tag. Similarly, any tags within a <phrase> tag must either specify the same field name as the phrase, or specify no field name at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field value is always used.) |
maxSnippets="SnippetsToOutput" | identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, any maxSnippets value set by the <phrase> tag must match the value set by any tags that contain it. Otherwise, an error will be generated. If no parent tags set a maxSnippets value, then any value may be specified by the <phrase> tag. Similarly, any tags within a <phrase> tag must either specify the same maxSnippets value as the phrase, or specify no maxSnippets value at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this phrase in the query. Boost values higher than 1.0 increase the relevance of a phrase, while boost values between 0.0 and 1.0 decrease the relevance of a phrase. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within a <phrase> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <phrase> tag. |
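The following sketch shows a phrase query and how nested boosts combine under the multiplication rule just described. The words and boost values are arbitrary examples.

```xml
<!-- Phrase query on the main text. The phrase boost (2.0) multiplies the
     boost on the nested term (1.5), giving that term 2.0 x 1.5 = 3.0. -->
<phrase field="text" boost="2.0">
  <term>white</term>
  <term boost="1.5">whale</term>
</phrase>
```

The nested term tags omit the field attribute, which is permitted because they then inherit the field named by the containing phrase tag.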
<exact {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </exact>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Normally, this attribute is set to the name of a meta field such as author or subject. It may not be set to text. It should be mentioned that the field name specified by an <exact> tag must match the field name set by any tags that contain it. Otherwise, an error will be generated. If no parent tags specify a field name, any field name can be used by the <exact> tag. Similarly, any tags within an <exact> tag must either specify the same field name as the <exact> tag, or specify no field name at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field value is always used.) |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, any maxSnippets value set by the <exact> tag must match the value set by any tags that contain it. Otherwise, an error will be generated. If no parent tags set a maxSnippets value, then any value may be specified by the <exact> tag. Similarly, any tags within an <exact> tag must either specify the same maxSnippets value as the <exact> tag, or specify no maxSnippets value at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this phrase in the query. Boost values higher than 1.0 increase the relevance of a phrase, while boost values between 0.0 and 1.0 decrease the relevance of a phrase. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within an <exact> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <exact> tag. |
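An exact query has the same shape as a phrase query but must name a meta-data field rather than text. The sketch below assumes a meta field called "title" exists in the index; that name is hypothetical.

```xml
<!-- "title" is an assumed meta-field name; field="text" is not allowed here. -->
<exact field="title">
  <term>moby</term>
  <term>dick</term>
</exact>
```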
<and {field = "FieldName" | fields = "Field1,Field2,..."} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"} {useProximity = "YesOrNo"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </and>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that if the field attribute appears in any tags outside it, the field name in the <and> tag must match the value set in the outer tags. Otherwise, an error will be generated. However, if the <and> tag does not specify a field name, and no tags outside it specify a field name, then the tags directly below the <and> tag may have any combination of field names desired. This is how mixed queries of document text and meta data are formed. |
fields="Field1,Field2,..." | is a multi-field alternative to the field attribute. It identifies a list of fields in the index to search, instead of a single field. Using a list of fields is an ideal way to perform a "keyword" search, for example searching the title, subject, author, and full text of documents. It should be mentioned that the fields attribute is only applicable to <and> and <or> queries. As with the field attribute, any elements nested within the tag are not allowed to specify a field or fields. |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <and> tag must match the value set in the containing tags. Otherwise, an error will be generated. Similarly, any occurrences of the maxSnippets attribute in tags within the <and> tag must have the same value as their containing tags. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within an <and> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <and> tag. |
useProximity="YesOrNo" | is an optional attribute that specifies whether the AND query should take the proximity of terms into account. Generally it's best to leave this on (the default) as it results in higher quality results for the user. However, turning it off can increase query processing speed, as the Text Engine will have less work to do to calculate which documents match the query. If not specified, this attribute defaults to Yes, that is, proximity will be taken into account. Note that if proximity processing is turned off, individual text hits within document text and meta-data fields will not be highlighted, and scores for matching documents will be somewhat different. |
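The mixed text and meta-data query described under the field attribute can be sketched as follows. Neither the and tag nor any enclosing tag names a field, so each child is free to name its own. The search words are arbitrary.

```xml
<!-- No field on <and>, so the children may each target a different field. -->
<and>
  <term field="text">whale</term>
  <term field="author">melville</term>
</and>
```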
<or {field = "FieldName" | fields = "Field1,Field2,..."} {slop = "MaxMatchDistance"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </or>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Normally, this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that if the field attribute appears in any tags outside it, the field name in the <or> tag must match the value set in the outer tags. Otherwise, an error will be generated. However, if the <or> tag does not specify a field name, and no tags outside it specify a field name, then the tags directly below the <or> tag may have any combination of field names desired. This is how mixed queries of document text and meta data are formed. |
fields="Field1,Field2,..." | is a multi-field alternative to the field attribute. It identifies a list of fields in the index to search, instead of a single field. Using a list of fields is an ideal way to perform a "keyword" search, for example searching the title, subject, author, and full text of documents. It should be mentioned that the fields attribute is only applicable to <and> and <or> queries. As with the field attribute, any elements nested within the tag are not allowed to specify a field or fields. |
slop="MaxMatchDistance" | is a measure of the "nearness" of the terms or clauses within a given field of a multi-field query. Note that this attribute is required for multi-field <or> queries, and not allowed for single-field <or> queries. To get an idea of what slop does, see the discussion of this attribute for the related <orNear> query, following this query. |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <or> tag must match the value set in the containing tags. Otherwise, an error will be generated. Similarly, any occurrences of the maxSnippets attribute in tags within the <or> tag must have the same value as their containing tags. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within an <or> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <or> tag. |
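A multi-field "keyword" search of the kind described above might be written as in the sketch below. The field list and the slop value of 10 are assumptions for illustration; slop is included because it is required whenever the fields attribute is used on an or tag.

```xml
<!-- Multi-field OR: nested terms must not specify field or fields.
     The field names and slop value are illustrative assumptions. -->
<or fields="title,subject,author,text" slop="10">
  <term>whale</term>
  <term>ship</term>
</or>
```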
<orNear slop = "MaxMatchDistance" {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </orNear>where
slop="MaxMatchDistance" | is a measure of the "nearness" of the terms or clauses specified. See the discussion of slop calculation below. |
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that the field name specified by an <orNear> tag must match the field name set by any tags that contain it. Otherwise, an error will be generated. If no parent tags specify a field name, any field name can be used by the <orNear> tag. Similarly, any tags within an <orNear> tag must either specify the same field name as the orNear clause, or specify no field name at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field value is always used.) |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <orNear> tag must match the value set in the containing tags. Otherwise, an error will be generated. Similarly, any occurrences of the maxSnippets attribute in tags within the <orNear> tag must have the same value as their containing tags. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within an <orNear> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <orNear> tag. |
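Syntactically, an orNear query looks like the sketch below. The slop value of 5 and the search words are arbitrary, and the example is meant only to show where the required slop attribute goes, not to suggest a recommended value.

```xml
<!-- slop is required on <orNear>; the value 5 is an arbitrary example. -->
<orNear slop="5" field="text">
  <term>whale</term>
  <term>harpoon</term>
</orNear>
```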
<not {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … </not>where
field="FieldName" | is an optional attribute that identifies which field in the index to search. Normally, this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that if the field attribute appears in any tags outside it, the field name in the <not> tag must match the value set in the outer tags. Otherwise, an error will be generated. However, if the <not> tag does not specify a field name, and no tags outside it specify a field name, then the tags directly below the <not> tag may have any combination of field names desired. This is how mixed queries of document text and meta data are formed. |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <not> tag must match the value set in the containing tags. Otherwise, an error will be generated. Similarly, any occurrences of the maxSnippets attribute in tags within the <not> tag must have the same value as their containing tags. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within a <not> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <not> tag. |
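The sketch below shows one plausible use of a not tag, under the assumption that it is nested inside an and query to exclude documents matching the negated term. This nesting pattern is an assumption of the example rather than something stated in the attribute descriptions above.

```xml
<!-- Assumed pattern: match "whale" but exclude documents that match "ship". -->
<and field="text">
  <term>whale</term>
  <not>
    <term>ship</term>
  </not>
</and>
```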
<near slop = "MaxMatchDistance" {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> Term | Clause Term | Clause … {OptionalSectionTypeQuery} </near>where
slop="MaxMatchDistance" | is a measure of the "nearness" of the terms or clauses specified. See the discussion of slop values below. |
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that the field name specified by a <near> tag must match the field name set by any tags that contain it. Otherwise, an error will be generated. If no parent tags specify a field name, any field name can be used by the <near> tag. Similarly, any tags within a <near> tag must either specify the same field name as the near clause, or specify no field name at all. Otherwise, an error will be generated. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field value is always used.) |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <near> tag must match the value set in the containing tags. Otherwise, an error will be generated. Similarly, any occurrences of the maxSnippets attribute in tags within the <near> tag must have the same value as their containing tags. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. Note that boost values multiply. That is, if tags within a <near> tag have boost attributes, their individual boost values will be multiplied by the boost set for the containing <near> tag. |
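A near query follows the same shape as the other clause tags, with slop as its one required attribute. In this sketch the slop value and search words are arbitrary, and the nested phrase simply illustrates that a child may be a clause rather than a single term.

```xml
<!-- slop is required on <near>; the value 5 is an arbitrary example. -->
<near slop="5" field="text">
  <term>captain</term>
  <phrase>
    <term>white</term>
    <term>whale</term>
  </phrase>
</near>
```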
<range {inclusive = "YesOrNo"} {numeric = "YesOrNo"} {field = "FieldName"} {maxSnippets = "SnippetsToOutput"} {boost = "BoostValue"}> <lower>FirstTermToFind</lower> <upper>LastTermToFind</upper> {OptionalSectionTypeQuery} </range>where
inclusive="YesOrNo" | is an optional attribute that specifies whether the range should include the first and last term when matching. If not specified, this attribute defaults to yes. |
numeric="YesOrNo" | is an optional attribute that specifies whether the data in the field is numeric and in a rigid consistent format. If set to yes, upon the first range query on the field, XTF will read into memory a table of all the data values, converting them to 64-bit integers; subsequent queries can then be processed extremely efficiently. If this attribute is set to no, the query is much more tolerant of variable formatting in the data; XTF will expand the range query into a multi-term OR (just like a wildcard query.) However, for some types of data, wildcard expansion can result in too many terms for the engine to handle. If not specified, this attribute defaults to no. |
field="FieldName" | is an optional attribute that identifies which field in the index to search. Often this attribute is set to text to indicate that the main text of the document should be searched. It can also be set to the name of a meta field such as author or subject. It should be mentioned that if the field attribute appears in any tags outside it, the field name in the <range> tag must match the value set in the outer tags. Otherwise, an error will be generated. (Note: Allowing nested copies of the field attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost field value is always used.) |
maxSnippets="SnippetsToOutput" | is an optional attribute that identifies the number of snippets to pass on to the Result Formatter stylesheet for display. A snippet is defined as the matching text found in a document for a particular query, along with some additional text around it for context. The amount of context displayed for each match is defined by the maxContext attribute. If not specified this attribute defaults to 3, meaning snippets for the top three matches for a document are returned. Also, this attribute can be set to -1, meaning all the snippets for a document are returned. As with the field attribute, if the maxSnippets attribute appears in any containing tags, the value set for it in the <range> tag must match the value set in the containing tags. Otherwise, an error will be generated. (Note: Allowing nested copies of the maxSnippets attribute doesn't serve a purpose other than to make it easier to write docReqParser.xsl stylesheets in a uniform way. Effectively, the outermost maxSnippets value is always used.) |
boost="BoostValue" | is an optional attribute that specifies a relevance boost multiplier for this clause in the query. Boost values higher than 1.0 increase the relevance of a clause, while boost values between 0.0 and 1.0 decrease the relevance of a clause. Boost values less than zero will generate an error. |
FirstTermToFind | is the first term in the range to find. Usually, this is a starting number, date, or year. |
LastTermToFind | is the last term in the range to find. Usually, this is an ending number, date, or year. |
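As a rough illustration of the attributes above, a numeric range query over a year field might look like the sketch below. The field name year and the bounds are made up for the example, and the lower and upper child elements are assumed to be the wrappers for FirstTermToFind and LastTermToFind; confirm them against the range tag syntax given earlier in this reference.

```xml
<!-- Hypothetical example: find documents whose "year" meta field falls in 1990..1999. -->
<!-- numeric="yes" assumes every year value is stored in a rigid, consistent numeric format. -->
<range field="year" numeric="yes" maxSnippets="3" boost="1.0">
  <lower>1990</lower>  <!-- FirstTermToFind -->
  <upper>1999</upper>  <!-- LastTermToFind -->
</range>
```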
<sectionType> Term | Clause </sectionType>This tag may only be used within another query, and has the effect of limiting that query to search only those parts of the full document text whose sectionType attributes match the specified term or clause. The sectionType tag is only allowed within queries on the "text" field. XTF evaluates the specified term or clause against the section type attributes recorded at index time by the Pre-Filter stylesheet using the xtf:sectionType attribute. For example, to offer the option of searching only the chapter headings of all books in a repository, the pre-filter would be modified to mark all the headings with xtf:sectionType="heading", and a sectionType tag containing a term tag on the word "heading" would then be added within the main text query. Within the sectionType tag, Term is a term tag, and Clause is a phrase, exact, and, or, orNear, near, range, resultData, or not tag.
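The chapter-heading scenario described above could be sketched as follows. This assumes the pre-filter has already marked headings with xtf:sectionType="heading"; the search word whale is only a placeholder.

```xml
<!-- Hypothetical query: find "whale", but only inside sections marked as headings. -->
<and field="text">
  <term>whale</term>
  <sectionType>
    <!-- Matched against the xtf:sectionType values recorded at index time. -->
    <term>heading</term>
  </sectionType>
</and>
```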
<facet field = "FieldName" {select = "GroupsToSelect"} {sortGroupsBy = "SortKind"} {sortDocsBy = "ListOfMetaFields|score|totalHits"} {includeEmptyGroups = "YesOrNo"} />where
field="FieldName" | is a required attribute that identifies which meta-data field in the index for which to count and build groups. (Note: Meta tags to be used for faceted queries should also have an xtf:tokenize="no" attribute set, or sorting will produce unpredictable results.) |
select="GroupsToSelect" | is an optional attribute specifying a subset of groups to select and return in the query result. For maximum flexibility, this specification is made using a special language that resembles XPath to some extent. It allows selecting groups by name or position in the list, and supports various operations on hierarchical meta-data. See examples below. If this attribute is not specified, it defaults to: * |
sortGroupsBy="TotalDocsOrValue" | is an optional attribute telling XTF the order in which to sort groups. |
sortDocsBy="ListOfMetaFields|score|totalHits" | is an optional attribute specifying a list of meta fields by which to sort the results. The list should consist of a quoted string containing one or more meta-field names, separated by commas. If multiple meta-fields are specified, the results are sorted first by the leftmost meta-field, then sub-sorted by subsequent fields to produce the final output. Optionally, each meta-field name can be preceded by a plus sign (+) or a minus sign (-) to indicate whether the results for that field should be sorted in ascending or descending order. If no plus or minus sign is specified for a meta-field, then the results are sorted in ascending order by default. An additional option is available: setting this attribute to "totalHits" will order the results by descending number of hits within each document. That is, a document with more hits will appear before one with fewer hits, regardless of the quality of those hits. If this attribute is not specified, documents are by default sorted in order of decreasing score (so the most "relevant" documents are first.) This default behavior can also be explicitly set by providing a value of "score" (or synonym "relevance"). (Note: Meta tags to be used for sorting should also have an xtf:tokenize="no" attribute set, or sorting will produce unpredictable results.) |
includeEmptyGroups="YesOrNo" | is an optional attribute that specifies whether to include empty groups in the results. If set to "yes", empty groups will be included. If set to "no" they will be excluded. If this attribute is not specified, it defaults to "no." |
*[1-5] Politics#all **[topChoices] US::Berkeley#all|US::* History#all|**[selected][page(size=5)]
For more information, refer to the Group Selection section of the XTF Programmer's Guide.
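To make the attribute descriptions above more concrete, here is a minimal sketch of a facet tag. The field names subject, title, and year are hypothetical, and the select value is borrowed from the expressions listed above; the Group Selection section of the XTF Programmer's Guide remains the authority on what each expression selects.

```xml
<!-- Hypothetical facet on a "subject" meta field (indexed with xtf:tokenize="no"). -->
<!-- Documents inside each group are sorted by title ascending, then year descending. -->
<facet field="subject"
       select="*[1-5]"
       sortDocsBy="+title,-year"
       includeEmptyGroups="no"/>
```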
<spellcheck {fields = "FieldNames" } {docScoreCutoff = "MaxDocScore"} {totalDocsCutoff = "MaxDocCount"}/>where
fields="FieldNames" | is an optional attribute that restricts spelling correction to the specified set of fields. The field names can be separated by commas, semicolons, pipe symbols (|), or spaces. If not specified, or if set to the special value #all, then all tokenized fields in the index will be checked for spelling (including the special field text which contains all words not marked as meta-data.) Specifying a subset of fields can speed up query processing if the Query Parser stylesheet introduces extra fields that the user didn't explicitly type, and thus needn't be checked for spelling. |
docScoreCutoff="MaxDocScore" | is an optional attribute that controls whether XTF performs spelling correction. If any document resulting from the query scores higher than this number, no correction will be performed. If not specified, this attribute defaults to 0.0, which disables the score cutoff. |
totalDocsCutoff="MaxDocCount" | is an optional attribute that controls whether XTF performs spelling correction. If the number of documents resulting from the query exceeds this number, spelling correction will not be performed. If set to zero, the document count cutoff is disabled (correction will always be considered.) If not specified, this attribute defaults to 10, meaning that spelling correction is applied if fewer than 10 documents are found by a query. |
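A spellcheck tag using these attributes might look like the sketch below. The field names are illustrative only, and the cutoff simply restates the documented default.

```xml
<!-- Hypothetical example: only check spelling for terms in these fields, -->
<!-- and skip correction when the query already finds more than 10 documents. -->
<spellcheck fields="title,author,text" totalDocsCutoff="10"/>
```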
<resultData> YourDataHere </resultData>XTF will pass the tag and its contents unchanged to the result formatter.
<moreLike fields = "FieldList" {boosts = "BoostFactorList"} {minWordLen = "MinWordLength"} {maxWordLen = "MaxWordLength"} {minDocFreq = "MinDocFrequency"} {maxDocFreq = "MaxDocFrequency"} {minTermFreq = "MinTermFrequency"} {termBoost = "ShouldBoostTerms"} {maxQueryTerms = "MaxQueryTerms"}> DocumentQuery </moreLike>where
fields="FieldList" | is a required attribute naming all of the fields that XTF should search for "interesting" terms. The field names may be separated by spaces, commas, semicolons, or pipe symbols "|". For best performance, this list should be kept relatively small, and concentrate on fields of most interest to users, such as title, author, subject, etc. Note that XTF currently does not support using the special field name text to search the full document text for interesting terms, and behavior is undefined if you specify this as a field name. |
boosts="BoostFactorList" | is an optional attribute that specifies exactly one boost factor for each field listed in the fields attribute. Each boost factor should be a non-negative decimal number, and is multiplied into the scoring for all terms from the given field. For example, a boost factor of 0.5 will reduce the score for terms by half, while a factor of 2.0 will double the score. In general, the boost factor is very useful in adjusting the weight that various fields have on selecting similar documents. For instance, if one decided that the title should be twice as important as author and subject, the fields attribute might be "title,author,subject" and the boosts attribute would be "2.0,1.0,1.0". If not specified, the boost factor for all fields in the fields list is set to 1.0. |
minWordLen="MinWordLength" | is an optional attribute that limits the length of terms from the source fields that will be considered for similarity matching. Terms shorter than the specified number of characters will be disregarded. This can speed up processing and improve results by getting rid of useless words. If not specified, this attribute defaults to 4. |
maxWordLen="MaxWordLength" | is an optional attribute that limits the length of terms from the source fields that will be considered for similarity matching. Terms longer than the specified number of characters will be disregarded. This can speed up processing and improve results by getting rid of useless words. If not specified, this attribute defaults to 12. |
minDocFreq="MinDocFrequency" | is an optional attribute that helps select which terms from source fields will be considered for similarity matching. In particular, terms that appear in fewer than the specified number of documents will be discarded. This can speed processing and improve results by discarding highly unusual terms. If not specified, this attribute defaults to 2. |
maxDocFreq="MaxDocFrequency" | is an optional attribute that helps select which terms from source fields will be considered for similarity matching. In particular, terms that appear in more than the specified number of documents will be discarded. This can speed processing and improve results by discarding very common terms. If not specified, this attribute defaults to -1, meaning that there is no limit at all. |
minTermFreq="MinTermFrequency" | is an optional attribute that helps select which terms from source fields will be considered for similarity matching. In particular, if the term occurs in the original field fewer than the specified number of times, it will be discarded. This can help choose more relevant terms by concentrating on those that are repeated in the field. If not specified, this attribute defaults to 1. |
termBoost="ShouldBoostTerms" | is an optional attribute that controls whether the similarity engine should calculate and attach a boost factor to each term. This factor will be equal to the score that was calculated for that term, and serves to make more important terms select documents more specifically. In general, it's best to leave this at the default value, which is true. |
maxQueryTerms="MaxQueryTerms" | is an optional attribute that controls how many "interesting" terms are selected from the original document's fields. Generally, this should be chosen to balance speed (more terms take longer to process) vs. quality (more terms can result in higher quality results, up to a point.) If not specified, this attribute defaults to 10. |
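The title-weighted scenario from the boosts description could be written roughly as follows. The inner query is an assumption: it presumes the source document can be selected by a term query on a hypothetical identifier field, and doc-0001 is a placeholder value.

```xml
<!-- Hypothetical example: find documents similar to the one matched by the inner query. -->
<!-- Title terms count twice as much as author and subject terms. -->
<moreLike fields="title,author,subject"
          boosts="2.0,1.0,1.0"
          minWordLen="4"
          maxWordLen="12"
          maxQueryTerms="10">
  <!-- DocumentQuery: assumed to select the single document to compare against. -->
  <term field="identifier">doc-0001</term>
</moreLike>
```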
<allDocs/>
<crossQueryResult> <docHit> <meta> <snippet> <hit> <term> <explanation> <facet> <group> <spelling> <suggestion>
<crossQueryResult queryTime = "TimeInSeconds" totalDocs = "NumberOfDocs" startDoc = "FirstDocNumber" endDoc = "LastDocNumber"> Parameters Query Spelling <!-- if spelling corrections requested and applicable --> DocumentHit DocumentHit … FacetResult <!-- if facets were queried --> FacetResult </crossQueryResult>where
queryTime="TimeInSeconds" | is the amount of time, in seconds, that the servlet spent parsing the query, processing it, and gathering the results. |
totalDocs="NumberOfDocs" | is the number of documents that had matches for the specified query. |
startDoc="FirstDocNumber" | is the sequential document number for the highest ranking document returned by the current query. Note that this may not be the overall highest ranking document if a paged query was specified. See the query tag for more details about performing paged queries. |
endDoc="LastDocNumber" | is the sequential document number for the lowest ranking document match returned by the current query. Note that this may not be the overall lowest ranking document if a paged query was specified. See the query tag for more details about performing paged queries. |
Parameters | is the same <parameters> block that was sent to the Query Parser stylesheet |
Query | is the query block that was output by the Query Parser stylesheet, included for reference. This <query> block and the <parameters> block may be useful to the Search Result Formatter stylesheet in formulating its output. |
Spelling | is a tag that will appear only if spell checking was enabled in the query, and a spelling correction dictionary was created at index time (i.e. enabled in textIndexer.conf), and the engine detected likely misspelled words and suitable suggested replacements. See the Spelling Correction Result Tag for more information. |
DocumentHit | is one or more tags, one per document in which search hits were found. These tags contain specific hits for each document. See the Document Hit Tag for more information. |
FacetResult | is a tag that will be included only if a facet query was performed. These tags will contain grouped counts of the matching documents. See the Facet Result Tag for more information. |
<docHit rank = "DocRelevanceRank" path = "DocumentLocation" score = "DocRelevanceScore"> DocumentMetaData Snippet Snippet … {ScoreExplanation} </docHit>where
rank="DocRelevanceRank" | is the ordinal ranking of this matching document, with 1 being the most relevant document for a query. Note that this is an absolute ranking for the document with respect to the entire query, and not a page relative ranking. For more information about paged queries, see the query tag. |
path="DocumentLocation" | is the file path and name for the matching document. This path is relative to the base XTF directory (i.e., XTF_HOME.) |
score="DocRelevanceScore" | is the document relevance score ranging from 0% to 100%. The document with the highest overall relevance will receive a score of 100%, and less relevant documents will receive lower scores. Note that this score is an overall relevance score and is not affected by paging. For information about paged queries, see the query tag. |
DocumentMetaData | is a tag that contains the meta-data for the matching document. See the Document Meta-Data Tag for more information. |
Snippet | is one or more tags that contain the text of individual hits within the document, along with surrounding context, as requested in the query. See the Snippet Tag for more information. |
ScoreExplanation | is one or more tags, present only if score explanation was requested in the query. These tags detail how the textEngine calculated the score for this document. See the Score Explanation Tag for more information. |
<meta> … </meta>The actual tags within the meta-data block will depend on the implementation of the Pre-Filter used by the textIndexer. However, any matches found in the meta-data will be marked with Snippet or Hit tags, depending on what the query specified (snippets by default). The document meta-data tag is always included in the query results, regardless of whether there are any matches in it or not. This guarantees that the result formatter has access to the title of the document and other document related information in addition to the match results.
<snippet rank="MatchRelevanceRank" score="MatchRelevanceScore"> Hit Text (and context text, if any) </snippet>where
rank="MatchRelevanceRank" | is the ordinal ranking of this match in the current document, with 1 being the most relevant match in the document. Note that this is an absolute ranking for the match with respect to the entire document, and not a page relative ranking. For more information about paged queries, see the query tag. |
score="MatchRelevanceScore" | is the relevance score for this match ranging from 0% to 100%. The snippet with the highest overall relevance will receive a score of 100%, and less relevant snippets will receive lower scores. Note that this score is an overall relevance score and is not affected by paging. For information about paged queries, see the query tag. |
<hit> … </hit>A hit may contain one or more matched words, which are separately marked with term tags. If the original query used a near or and clause, the hit tag will mark the entire range of text between the first and last word found for the clause. For example, for a "man" near "war" query, the result would look like this:
<snippet rank="3" score="86"> that <hit><term>man</term> had never actually been to <term>war</term></hit>, but he spoke as if he had </snippet>For an OR query, if multiple matched words exist in the snippet, all the matched words will be marked with Term tags, but only the word that this snippet is centered around will be marked with a hit tag.
<term> MatchedWord </term>A term tag may or may not be inside a hit tag, depending on whether the occurrence of the matched word is within the primary match for a snippet, or simply another occurrence within the context text.
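Putting the result tags described so far together, a single-hit result for the "man" near "war" query might look roughly like the sketch below. All attribute values, the document path, and the title element inside the meta block are invented for illustration; the real meta-data tags depend on the Pre-Filter in use.

```xml
<!-- Hypothetical crossQuery result with one matching document. -->
<crossQueryResult queryTime="0.12" totalDocs="1" startDoc="1" endDoc="1">
  <!-- The <parameters> and <query> blocks are omitted here for brevity. -->
  <docHit rank="1" path="data/books/example.xml" score="100">
    <meta>
      <!-- Meta-data tags are defined by the pre-filter; "title" is an assumption. -->
      <title>An Example Title</title>
    </meta>
    <snippet rank="1" score="100">that <hit><term>man</term> had never actually been to <term>war</term></hit>, but he spoke as if he had</snippet>
  </docHit>
</crossQueryResult>
```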
<explanation value="Score" description="Description"> <explanation...> <explanation...> ... </explanation>where
Score | is a floating-point number calculated by XTF |
Description | is a brief, technical description of how this score value was calculated. |
<facet field = "FieldName" totalGroups = "NumberOfGroups" totalDocs = "NumberOfDocs"> GroupResult GroupResult … </facet>where
field="FieldName" | is the name of the meta-data field for which faceted data is being reported. |
totalGroups="NumberOfGroups" | is the number of groups this facet contains (which may be more than are selected and returned as GroupResults.) In the case of a hierarchical facet, this is actually a count of the top-level groups only. |
totalDocs="NumberOfDocs" | is the number of documents that had matches for the specified query and had a value for this facet. |
<group value = "GroupValue" rank = "GroupSortedRank" totalSubGroups = "NumOfSubGroups" totalDocs = "NumberOfDocs" startDoc = "FirstDocNumber" endDoc = "LastDocNumber"> GroupResult <!-- if facet is hierarchical --> GroupResult … DocumentHit <!-- if document hits were requested --> DocumentHit … </group>where
value="GroupValue" | is the specific facet value of the group being reported. One might also think of this as the "name" of the group. |
rank="GroupSortedRank" | is the ordinal ranking of this group within the set of groups at this level, with 1 being the first in sort order. Note that this is an absolute ranking for the group with respect to the entire set, and not a page relative ranking. For more information about paging groups, see the Group Selection section of the XTF Programmer's Guide. |
totalSubGroups="NumOfSubGroups" | is, for a hierarchical facet, the number of sub-groups this group contains, which may be more than were actually selected and returned. For a non-hierarchical facet, this will always be zero. |
totalDocs="NumberOfDocs" | is the number of documents that had matches for the specified main query and had GroupValue in the facet field. |
startDoc="FirstDocNumber" | is the sequential document number for the highest ranking document match reported for the current group. Note that this may not be the overall highest ranking document if a paged query was specified. See the Group Selection section of the XTF Programmer's Guide for more details about paging documents. |
endDoc="LastDocNumber" | is the sequential document number for the lowest ranking document match reported for the current group. Note that this may not be the overall lowest ranking document if a paged query was specified. See the Group Selection section of the XTF Programmer's Guide for more details about paging documents. |
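For orientation, a facet result for a non-hierarchical field might be shaped like the sketch below. The field name, group value, counts, and document paths are all invented; the example only shows how the attributes described above fit together when document hits were requested for a group.

```xml
<!-- Hypothetical facet result: 12 groups exist, but only one was selected and returned. -->
<facet field="subject" totalGroups="12" totalDocs="40">
  <!-- A non-hierarchical group, so totalSubGroups is zero. -->
  <group value="History" rank="1" totalSubGroups="0" totalDocs="9" startDoc="1" endDoc="2">
    <docHit rank="1" path="data/books/example1.xml" score="100">
      <meta><!-- meta-data omitted --></meta>
    </docHit>
    <docHit rank="2" path="data/books/example2.xml" score="74">
      <meta><!-- meta-data omitted --></meta>
    </docHit>
  </group>
</facet>
```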
<spelling> SpellingSuggestion SpellingSuggestion … </spelling>Within the Spelling Tag, one or more Spelling Suggestion tags will appear, one for each potentially misspelled term in the original query submitted to the engine. Note that this tag will only appear if spell checking was enabled in the query, and a spelling correction dictionary was created at index time (i.e. enabled in textIndexer.conf), and the engine detected likely misspelled words and suitable suggested replacements.
<suggestion origTerm = "OriginalWord" suggestedTerm = "ReplacementWord"/>where
origTerm="OriginalWord" | is the (potentially) misspelled term found in the original query. |
suggestedTerm="ReplacementWord" | is the best correction that the spelling engine could find for the original term. This may be two words if the original word should be split (e.g. "harrypotter" -> "harry potter"). This may be an empty string, indicating that the original word should be removed from the query (e.g. "usa" "bility" -> "usability" and an empty string.) |
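Using the split-word case mentioned above, a spelling result might look like this minimal sketch:

```xml
<!-- Example built from the "harrypotter" case described above. -->
<spelling>
  <suggestion origTerm="harrypotter" suggestedTerm="harry potter"/>
</spelling>
```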
<style> <source> <brand> <index> <query> Public Authentication IP List Authentication LDAP Authentication External Authentication
<style path="DocFormatterLocation"/>where
path="DocFormatterLocation" | is the file path and name for the Document Formatter stylesheet to use for the requested document. If this path is not specified as an absolute path, it is assumed to be relative to the base XTF installation directory (i.e., XTF_HOME.) |
<source path="SrcDocLocation"/>where
path="SrcDocLocation" | is the file path and name for the document to be retrieved. If this path is not specified as an absolute path, it is assumed to be relative to the base XTF installation directory (i.e., XTF_HOME.) |
<brand path="BrandStylesheetLocation"/>where
path="BrandStylesheetLocation" | is the file path and name for the brand file to use. If this path is not specified as an absolute path, it is assumed to be relative to the base XTF installation directory (i.e., XTF_HOME.) |
<name>value</name> <name>value</name> …These parameters are most often used to pass "branding" information to the Document Formatter stylesheet (e.g., background color to use, cascading stylesheet to use, font to use, etc.)
<index configPath="TextIndexerConfigLocation" name="IndexName"/>where
configPath="TextIndexerConfigLocation" | is the file path and name for the textIndexer config file. If this path is not specified as an absolute path, it is assumed to be relative to the base XTF installation directory (i.e., XTF_HOME.) |
name="IndexName" | is the name of the index to use. This index name must exist in the config file specified by the configPath attribute above. |
<query> crossQuery-Style Query Tags </query>where crossQuery-Style Query Tags are any of the query tags outlined in the Query Parser Output Tags section. Note however that the dynaXML query tag doesn't use the attributes available for the crossQuery query tag. Including a query in the document request allows the dynaXML servlet to mark query hits in context in the original document. Then, the Document Formatter stylesheet can opt to provide quick links to the hits or to simply highlight them in context. Note: If a query tag is used in a document request, the index tag must also be present in the document request. If the index tag is not present in the document request, an Unsupported Query error will be sent to the Error Generator stylesheet.
<auth access="AllowOrDeny" type="all"/>where
access="AllowOrDeny" | specifies whether all users should be allowed access (access="allow") or denied access (access="deny") to the requested document. |
<auth access="AllowOrDeny" type="IP" list="LocationOfIPList"/>where
access="AllowOrDeny" | specifies whether addresses in the IP list should be allowed access (access="allow") or denied access (access="deny") to the requested document. |
list="LocationOfIPList" | specifies the path and filename of a list of IP addresses to allow or deny access. If not specified as an absolute path, this path is assumed to be relative to the XTF base install directory (i.e., XTF_HOME.) To learn about the format of the IP List file, see the XTF Deployment Guide. |
<auth access = "allow" type = "LDAP" server = "LDAPServerURL" realm = "PswdRequestDescr" {bindName = "LDAPConnectName"} {bindPassword = "LDAPConnectPswd"} {queryName = "LDAPRecordNameToFind"} {matchField = "LDAPFieldToFind"} {matchValue = "LDAPValueToMatch"}/>where
server="LDAPServerURL" | identifies the location of the LDAP server to use. |
realm="PswdRequestDescr" | is a string to display in the browser dialog box that asks for the user's name and password. |
bindName="LDAPConnectName" | is an optional attribute specifying the name to use when connecting to the LDAP server. If this attribute is omitted, then an anonymous LDAP connection will be attempted. If anonymous connections are permitted by the LDAP database, then the bindPassword attribute should also be omitted, and the queryName attribute must be present for user authentication to proceed. For anonymous LDAP access the matchField and matchValue attributes are optional. If the name passed for this attribute is the LDAP administrator name, then the bindPassword attribute must be set to the LDAP administrator password, and the queryName must also be present for user authentication to proceed. For administrative LDAP access, the matchField and matchValue attributes are optional. It should also be noted that the user name will be substituted for any occurrence of the % symbol in this attribute. Doing so allows connections with the LDAP database to be established using the user name instead of an LDAP administrator name. Finally, if successfully connecting to the LDAP database with a user name and password is all that is required for authentication, then no other attributes need to be specified in the authentication tag. Otherwise, the queryName attribute and optionally the matchField and matchValue attributes may be specified to complete the authentication request. |
bindPassword="LDAPConnectPswd" | is an optional attribute specifying the password to use when connecting to the LDAP server. If an anonymous LDAP connection is being performed (i.e., the bindName attribute has not been specified), this attribute should also not appear in the authentication tag. If the bindName attribute specifies the LDAP administrator name, this attribute must be set to the LDAP administrator password. Finally, the user password will be substituted for any occurrence of the % symbol in this attribute. Doing so allows connections with the LDAP database to be established using the user password instead of an LDAP administrator password. |
queryName="LDAPRecordNameToFind" | is an attribute identifying the name of an LDAP record to find. If an anonymous or administrator connection to the LDAP server is being attempted, this attribute is required. For user connections, this attribute is optional. As with the bindName attribute, the user name will be substituted for any occurrence of the % symbol in this attribute. Doing so allows connections with the LDAP database to be established using the user name instead of an LDAP administrator name. Also, if the queryName attribute is specified without the matchField or matchValue attributes, then user authentication will succeed if the given record name simply exists in the LDAP database. If the given record is not in the LDAP database, authentication will fail. |
matchField="LDAPFieldToFind" | is an attribute identifying the name of a field to find in the LDAP record named by the queryName attribute. Note that the matchField attribute should not be used if the queryName attribute hasn't been specified. Like the queryName attribute, the user name will be substituted for any occurrence of the % symbol in this attribute. Doing so allows connections with the LDAP database to be established using the user name instead of an LDAP administrator name. Finally, if the matchField attribute is specified without the matchValue attribute, then user authentication will succeed if the given field name simply exists in the LDAP record. If the given field name does not exist in the LDAP database, authentication will fail. |
matchValue="LDAPValueToMatch" | is an attribute that specifies the value that must exist in the LDAP field named by the matchField attribute for authentication to succeed. If the specified value doesn't match the LDAP field, user authentication will fail. As with the previous attributes, the user's password will be substituted for any occurrences of the % symbol. Doing so allows connections with the LDAP database to be established using the user password instead of an LDAP administrator password. |
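The simplest case described above, where a successful bind with the user's own name and password is all that authentication requires, might be sketched as follows. The server URL, realm text, and directory layout (uid, ou, dc components) are assumptions for illustration only.

```xml
<!-- Hypothetical LDAP authentication: bind as the user who is logging in. -->
<!-- Each % in bindName is replaced by the user name; % in bindPassword by the user password. -->
<auth access="allow"
      type="LDAP"
      server="ldap://ldap.example.org:389"
      realm="Library Staff Login"
      bindName="uid=%,ou=people,dc=example,dc=org"
      bindPassword="%"/>
```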
<auth access = "allow" type = "external" key = "SecretKeyStr" url = "AuthenticationURL"/>Note: One or more auth tags must exist in the Document Request Parser stylesheet. These tags will be processed in the order they are encountered until one of them authorizes or denies access. If none of the authentication tags explicitly authorize or deny access, the dynaXML servlet will deny access by default. For more details about external authentication, see the XTF Deployment Guide.
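As a sketch of how the document request tags combine, the fragment below requests one document with hit marking, allows a list of IP addresses, and explicitly denies everyone else. All paths, the index name, and the search term are hypothetical, and the enclosing element produced by the Document Request Parser stylesheet is not shown.

```xml
<!-- Hypothetical document request fragment; every path below is an assumption. -->
<style path="style/dynaXML/docFormatter/example/docFormatter.xsl"/>
<source path="data/books/example.xml"/>
<!-- An index tag is required because a query tag is present. -->
<index configPath="conf/textIndexer.conf" name="default"/>
<query>
  <term field="text">whale</term>
</query>
<!-- Auth tags are processed in order until one authorizes or denies access. -->
<auth access="allow" type="IP" list="conf/trustedIPs.txt"/>
<auth access="deny" type="all"/>
```

Ending with an explicit deny for all users makes the fallback visible in the stylesheet instead of relying on the servlet's default denial.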
xtf:hitCount="NumberOfHitsBelowThisTag" xtf:firstHit="FirstHitNumberBelowThisTag"These attributes are added to XML documents to indicate where matched text hits are located. By providing these tags, the dynaXML servlet allows the Document Formatter stylesheet to quickly determine if a section of a document needs any special highlighting or not. If the requested document has no hits, these attributes will appear once in the outermost tag for the document with both the attributes set to zero. If the document has one or more hits, these attributes will appear for any XML tag that has a hit inside it or inside its child tags.
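For a document with two hits, the attributes might be distributed as in the sketch below. The element names are invented, and the example assumes that hit numbers start at 1; tags without hits inside them carry no attributes.

```xml
<!-- Hypothetical marked-up document with two hits in the first chapter. -->
<book xmlns:xtf="http://cdlib.org/xtf" xtf:hitCount="2" xtf:firstHit="1">
  <chapter xtf:hitCount="2" xtf:firstHit="1">
    <p xtf:hitCount="1" xtf:firstHit="1">text containing the first hit</p>
    <p>text with no hits</p>
    <p xtf:hitCount="1" xtf:firstHit="2">text containing the second hit</p>
  </chapter>
  <chapter>
    <p>a chapter with no hits</p>
  </chapter>
</book>
```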
<xtf:snippets> <xtf:snippet> <xtf:hit> <xtf:more> <xtf:term>
<xtf:snippets> Snippet Snippet … </xtf:snippets>where each Snippet is a dynaXML snippet tag that summarizes one query match in the requested document. Note: The <snippets> tag is prefixed with the xtf: namespace to differentiate it from tags that came from the original XML document.
<xtf:snippet hitNum="HitNumber" rank="MatchRelevanceRank" score="MatchRelevanceScore"> Hit Text (and context text, if any) </xtf:snippet>where
xtf: | is an XTF namespace prefix added to the tag to differentiate it from tags that came from the original XML document. The namespace URI for this prefix is: http://cdlib.org/xtf |
hitNum="HitNumber" | is the ordinal ID of the current hit. This attribute will also appear in hit tags in the main text, allowing the hit number for the next or previous in-context tag to be easily determined. |
rank="MatchRelevanceRank" | is the ranking of this match in the current document, with 1 being the most relevant match in the document. |
score="MatchRelevanceScore" | is the relevance score for this match ranging from 0% to 100%. The snippet with the highest overall relevance will receive a score of 100%, and less relevant snippets will receive lower scores. |
<xtf:hit hitNum = "HitNumber" rank = "MatchRelevanceRank" score = "MatchRelevanceScore" continues = "YesOrNo"> … </xtf:hit>where
xtf: | is an XTF namespace prefix added to the tag to differentiate it from tags that came from the original XML document. The prefix will not be present in hit tags that appear within the initial snippets summary tag, but only in hit tags that occur in the main text for the document. The namespace URI for this prefix is: http://cdlib.org/xtf |
hitNum="HitNumber" | is the ordinal ID of the current hit. This attribute allows the hit number for the next or previous in-context tag to be easily determined. |
rank="MatchRelevanceRank" | is the ranking of this match in the current document, with 1 being the most relevant match in the document. |
score="MatchRelevanceScore" | is the relevance score for this match ranging from 0% to 100%. The snippet with the highest overall relevance will receive a score of 100%, and less relevant snippets will receive lower scores. |
continues="YesOrNo" | indicates whether this hit continues into the next XML tag (continues="yes") or not (continues="no"). Note that when a hit continues into the next XML tag, a More Tag will always follow. |
<xtf:more hitNum = "HitNumber" rank = "MatchRelevanceRank" score = "MatchRelevanceScore" continues = "YesOrNo"> … </xtf:more>where
xtf: | is an XTF namespace prefix added to the tag to differentiate it from tags that came from the original XML document. The namespace URI for this prefix is: http://cdlib.org/xtf |
hitNum="HitNumber" | is the ordinal ID of the hit to which this More Tag belongs. |
rank="MatchRelevanceRank" | is the ranking of the hit to which this More Tag belongs. |
score="MatchRelevanceScore" | is the relevance score of the hit to which this More Tag belongs. |
continues="YesOrNo" | indicates whether the associated hit continues into yet another XML tag (continues="yes") or not (continues="no"). Note that when a hit continues into another XML tag, another more tag will always follow. |
<xtf:term>MatchedWord</xtf:term>where
xtf: | is an XTF namespace prefix added to the tag to differentiate it from tags that came from the original XML document. The prefix will not be present in term tags that appear within the initial snippets summary tag, but only in term tags that occur in the main text for the document. The namespace URI for this prefix is: http://cdlib.org/xtf |
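The following sketch shows how these tags might appear together for one hit that spans two paragraphs. The book and p elements, the text, and the placement of the snippets summary at the start of the document are assumptions; note that hit and term tags carry the xtf: prefix only in the main text, not inside the snippets summary.

```xml
<!-- Hypothetical dynaXML output for a single hit that crosses a tag boundary. -->
<book xmlns:xtf="http://cdlib.org/xtf">
  <xtf:snippets>
    <!-- Inside the summary, hit and term tags are unprefixed. -->
    <xtf:snippet hitNum="1" rank="1" score="100">that <hit><term>man</term> had never actually been to <term>war</term></hit>, but he spoke</xtf:snippet>
  </xtf:snippets>
  <!-- The hit starts here and continues into the next tag. -->
  <p>that <xtf:hit hitNum="1" rank="1" score="100" continues="yes"><xtf:term>man</xtf:term> had never</xtf:hit></p>
  <!-- The more tag carries the same hitNum, rank, and score as the hit it completes. -->
  <p><xtf:more hitNum="1" rank="1" score="100" continues="no">actually been to <xtf:term>war</xtf:term></xtf:more>, but he spoke as if he had</p>
</book>
```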