Rowan Brownlee’s Beginner’s Guide to XTF

Media collections at the University of Sydney Library: Rowan Brownlee’s beginner’s guide to configuring XTF for presenting non-textual content and metadata

Rowan Brownlee, Digital Project Analyst, University of Sydney Library

XTF version 2.1.1

Last updated, 17 February 2010

The University of Sydney Library is using XTF to enable access to several media collections. The collections are publicly available, and this initiative results from partnerships between the Library, academics and associated project managers.

When I began working with XTF, I modified the default user interface to cater for presentation of a plant sciences image collection. I then sought to test application of the model across two additional collections within visual arts and archaeology. This section of the XTF wiki provides an outline of the steps I followed, using a collection of image and video reproductions of works created by staff and students at Sydney College of the Arts (SCA). For the SCA Archive, my project partner is Jacqui Spedding, a ceramic artist and project manager for SCA Images Online.

The presentation model described is not intended to suit all needs. It does, however, provide an example of adapting XTF stylesheets to suit non-textual media while capitalising on the strength and functionality of the framework.

Tutorial Components

About the Collections

Choosing XTF

I was working with botanic taxonomists and other librarians on a plant sciences digital media project titled eBot. We originally contracted a developer to create a PHP web form and a relational database enabling record creation, image submission, search, display and taxonomy management. Despite everyone’s best efforts, the results didn’t match the team’s expectations.

Seeking an alternative, I discovered that XTF was being used in April (an Australian Research Council project involving the Library and featuring Gary Browne, the Library’s development programmer). XTF will also be an important component of a planned upgrade to the Library’s text management platform. Although I wasn’t initially sure that XTF would meet the eBot project’s needs, there were a number of arguments in its favour. XTF is a ‘known’ technology backed by the Library, so there is a viable support model. Search and retrieval are very fast, and the presentation layer is highly configurable. The default installation includes a number of example stylesheets suitable for adaptation for presentation of non-textual media such as images and video.

A key feature of XTF is its ability to cater for a wide range of metadata sets. My academic partners all have project- or domain-specific metadata. XTF requires no modification of source metadata and has readily accommodated metadata from the plant sciences, visual arts and archaeology. If an XSLT stylesheet can be written, source metadata can be presented.

XTF is not, however, a purpose-built media management system, and we are investigating options for collection management and media processing. XTF also requires skills in related XML technologies such as XSLT and XPath. On balance, XTF is proving an excellent choice for our media search and presentation needs.

Gathering content

The SCA media collection management system uses a FileMaker database which was initially developed by Anthony Green (Visual Resources Librarian, Power Institute Visual Resources Library) for use within the Library. As part of an ongoing collaborative project, Anthony made the system available to SCA. Jacqui manages bibliographic metadata and related taxonomies for a number of research and teaching collections at SCA. For the XTF-based Archive, she exports records from FileMaker as a comma-separated values (CSV) file. I process the CSV file using a Python script to produce a set of XML records, ready for indexing by XTF.

Jacqui also provides sets of image and video files to be presented with their associated metadata. Using Photoshop batch processing techniques, Jacqui outputs thumbnail, view, PowerPoint and zoomified versions from the original archival TIFF files. Flix provides Flash versions of video files. I use a Python script to check that each XML record is accompanied by its full complement of associated files and that each file is in the correct format. (For web viewing of image files we need single-layered, 8-bit-per-channel sRGB images in JPEG format.) Along with its excellent text processing tools, Python offers a very useful imaging library, and I’m considering additional process checks prior to submission of media and metadata.

Installation and setup

I followed the instructions in the XTF installation quick start to verify that initial setup was successful. I then deleted the sample data and created the following set of directories for metadata and media.

data/sca/records
media/sca/images/powerpoint
media/sca/images/thumbs
media/sca/images/view
media/sca/images/zoom
media/sca/video

Although the media files are currently located within the XTF directory structure, we will probably house them elsewhere on our storage area network.

Telling XTF which records to index

style/textIndexer/docSelector.xsl

The stylesheet docSelector.xsl identifies which files to index and which stylesheets to apply to each type of file.

Default docSelector.xsl

Given sufficient information to identify a category of file, XTF is able to apply the required stylesheets for data processing. The stylesheet docSelector.xsl includes a number of example tests enabling TextIndexer to identify a variety of XML, HTML and other files. In the following extract, one of the ways that XTF identifies an EAD XML file is to test if the string ead appears within a file’s root element.

usyd_docSelector2.png

Modified docSelector.xsl

Within the file template, I added a section describing SCA XML metadata files. Since all of the records contain sca_record within the root element, I used this as an identifying feature. I also included path references to scaPreFilter.xsl and scaDocFormatter.xsl. (More on these two files in identifying metadata elements for indexing and associating stylesheets with metadata records.)
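
As a rough illustration of such a section, the sketch below tests the root element name and, for SCA records, emits an entry naming the two stylesheets. It is an approximation rather than the stylesheet actually used. The directory/@dirPath and file/@fileName input attributes, the indexFile output element with its preFilter and displayStyle attributes, and the use of document() to read the root element are assumptions based on the general shape of XTF's default docSelector.xsl, and should be checked against the installed copy.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <directory dirPath="..."><file fileName="..."/></directory> -->
  <xsl:template match="directory">
    <indexFiles>
      <xsl:apply-templates select="file"/>
    </indexFiles>
  </xsl:template>

  <xsl:template match="file">
    <xsl:if test="ends-with(@fileName, '.xml')">
      <!-- Read the candidate file and take the name of its root element -->
      <xsl:variable name="file" select="concat(../@dirPath, @fileName)"/>
      <xsl:variable name="root-element-name" select="name(document($file)/*)"/>
      <!-- SCA records are recognised by sca_record in the root element name -->
      <xsl:if test="contains($root-element-name, 'sca_record')">
        <indexFile fileName="{@fileName}"
                   preFilter="style/textIndexer/sca/scaPreFilter.xsl"
                   displayStyle="style/dynaXML/docFormatter/sca/scaDocFormatter.xsl"/>
      </xsl:if>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Files whose root element does not match produce no indexFile entry in this sketch, so they would simply be skipped.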

usyd_docSelector4.png


Identifying metadata elements for indexing and faceted browsing

style/textIndexer/sca/scaPreFilter.xsl

scaPreFilter.xsl identifies metadata elements for indexing and for faceted browsing. I adapted the example nlmPreFilter.xsl, retaining the import common templates, output parameters, identity transformation and root template sections. I removed the NLM indexing section but kept the same kind of structure for the get-meta template. Most of the stylesheet comprises named templates, each of which targets a particular metadata element. The syntactical requirements of metadata indexing templates are slightly different from those for facet templates, so I created two sets of templates, one to suit each type.
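
To make the two kinds of named template concrete, here is a minimal sketch of a get-meta template that calls one indexing template and one facet template. It is not the actual scaPreFilter.xsl. The template names, the source elements creator and culture, and the fields wrapper element are hypothetical, and the xtf:meta and xtf:facet attributes are given as I understand XTF's conventions, so they should be verified against the XTF documentation.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xtf="http://cdlib.org/xtf">

  <!-- Collect metadata for one record; each call targets a single element -->
  <xsl:template name="get-meta">
    <xsl:call-template name="get-sca-creator"/>
    <xsl:call-template name="get-sca-facet-culture"/>
  </xsl:template>

  <!-- Indexing template: emits a searchable metadata field -->
  <xsl:template name="get-sca-creator">
    <xsl:for-each select="/sca_record/creator">
      <creator xtf:meta="true">
        <xsl:value-of select="normalize-space(.)"/>
      </creator>
    </xsl:for-each>
  </xsl:template>

  <!-- Facet template: same value, additionally flagged for faceted browsing -->
  <xsl:template name="get-sca-facet-culture">
    <xsl:for-each select="/sca_record/culture">
      <facet-culture xtf:meta="true" xtf:facet="yes">
        <xsl:value-of select="normalize-space(.)"/>
      </facet-culture>
    </xsl:for-each>
  </xsl:template>

  <!-- Standalone entry point so the sketch can be run on a single record -->
  <xsl:template match="/">
    <fields>
      <xsl:call-template name="get-meta"/>
    </fields>
  </xsl:template>
</xsl:stylesheet>
```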

Calling named templates: identification of metadata elements

Extract from the get-meta template, listing calls to named templates targeting particular metadata elements for indexing.

Extract from get-meta template, listing calls to named templates targeting particular metadata elements for faceted browsing.

Example named templates for metadata indexing

Example named templates for faceted browsing

Associating stylesheets with metadata records

style/dynaXml/docReqParser.xsl

XTF needs to know which stylesheets to apply to the SCA XML records, both for indexing and for presentation as HTML pages. Within the root template, I include instructions to associate SCA-specific preFilter and docFormatter stylesheets with SCA XML records. As described in the previous section, scaPreFilter.xsl identifies metadata elements for indexing and faceted browsing. (scaDocFormatter.xsl is described in displaying full records.)

The document root element of each XML file is tested. If it contains sca_record, a variable fileType is assigned the value sca. Statements within the style tag and preFilter tag test the value of fileType. If fileType = sca, scaPreFilter.xsl and scaDocFormatter.xsl will be applied to the file.
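
The logic just described can be sketched as follows. This is a simplified stand-in, not the real docReqParser.xsl: it takes the record itself as input so that it runs on its own, whereas the real stylesheet receives request parameters and locates the source file itself. The result wrapper element and the path attribute on style and preFilter are assumptions.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Classify the record by the name of its root element -->
    <xsl:variable name="fileType">
      <xsl:choose>
        <xsl:when test="contains(name(/*), 'sca_record')">sca</xsl:when>
        <xsl:otherwise>other</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <result>
      <xsl:if test="$fileType = 'sca'">
        <!-- Stylesheet used to render the record as an HTML page -->
        <style path="style/dynaXML/docFormatter/sca/scaDocFormatter.xsl"/>
        <!-- Stylesheet used to prepare the record for indexing -->
        <preFilter path="style/textIndexer/sca/scaPreFilter.xsl"/>
      </xsl:if>
    </result>
  </xsl:template>
</xsl:stylesheet>
```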

Testing for an SCA XML record

If the XML record contains sca_record within the root element, the fileType variable is assigned the value sca.

Associating scaPreFilter.xsl with SCA XML records (for indexing)

If the fileType variable contains the value sca, scaPreFilter.xsl will be applied to the file.

Associating scaDocFormatter.xsl with SCA XML records (for presentation as html pages)

If the fileType variable contains the value sca, scaDocFormatter.xsl will be applied to the file.

Displaying brief search results

style/crossQuery/resultFormatter/default/resultFormatter.xsl

For a brief record display showing an image thumbnail and metadata, most of the requirements are met by XTF’s default resultFormatter.xsl stylesheet.

Adding a column for the image thumbnails

Within the docHit template, the default XTF brief record display is structured as an html table. In the first row I inserted an additional cell between the record result number and the Artist metadata field. I made the cell a fixed width and included a rowspan attribute to ensure that all of the associated metadata fields would align to the right of the image.

<!-- Thumbnail image row -->
<td rowspan="20" width="250px" align="top" valign="middle">

Making the thumbnails clickable

For each retrieved record, XTF needs to know where to find the associated thumbnail image file. In the Local parameters section I added a parameter named thumbsImageFilePath.

<!-- sca thumbnail images-->

<xsl:param name="thumbsImageFilePath" select="concat($xtfURL, 'media/sca/images/thumbs')"/>

Within the docHit template, I used a variable fileName to reference the name of the thumbnail file associated with the retrieved metadata record. For the SCA metadata records, the filename is contained within the resource metadata field. By concatenating the fileName and thumbsImageFilePath within the variable imageFile, I provide a thumbnail reference specific to each retrieved record. Each time the template renders a brief record, it has sufficient information (contained within the imageFile variable) to find the location of that record’s thumbnail.
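
A minimal sketch of this concatenation, assuming the resource field holds the complete thumbnail file name (including its extension) and that it is reachable as meta/resource within a docHit. The default value of xtfURL is a placeholder, and a slash is inserted because the thumbsImageFilePath parameter above has no trailing separator.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder base URL; XTF normally supplies this value -->
  <xsl:param name="xtfURL" select="'http://localhost:8080/xtf/'"/>
  <xsl:param name="thumbsImageFilePath"
             select="concat($xtfURL, 'media/sca/images/thumbs')"/>

  <xsl:template match="docHit">
    <!-- File name of this record's thumbnail, taken from the resource field -->
    <xsl:variable name="fileName" select="meta/resource[1]"/>
    <!-- Full URL of the thumbnail: directory, separator, file name -->
    <xsl:variable name="imageFile"
                  select="concat($thumbsImageFilePath, '/', $fileName)"/>
    <td rowspan="20" width="250" valign="middle">
      <img src="{$imageFile}" alt="{meta/title[1]}"/>
    </td>
  </xsl:template>
</xsl:stylesheet>
```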

To make the thumbnail clickable, I copied the xsl already provided for the default clickable title field.

For each thumbnail, its target page is a full record display. For more information, see displaying full records.

Altering metadata field display labels

XTF’s default brief record layout includes an Author metadata label. In the example below, taken from the docHit template, I change the label to Artist. Note that the metadata tag is named creator, reflecting the metadata element tag definition contained in scaPreFilter.xsl (see identifying metadata elements for indexing and faceted browsing). Metadata display labels may differ from metadata tag names.

Screenshot: Default brief record display

Default brief record display showing EAD records.

Screenshot: Modified brief record display

Modified brief record display illustrating the addition of image thumbnails and SCA metadata elements. Note that facets are not yet displaying. Although metadata elements have been identified for faceted browsing (described in Identifying metadata elements for indexing and faceted browsing), additional work is required to produce facets (see sorting, browsing and facets).

In the above screenshot, Bookbag has changed to Citations. This is another simple change to display text, along the lines of altering a metadata field label.

Displaying full records

style/dynaXML/docFormatter/sca/scaDocFormatter.xsl

In associating stylesheets with metadata records, I describe how a relationship is signified between a particular stylesheet and its associated set of XML records. All SCA XML records are associated with the scaDocFormatter stylesheet. When a user clicks on a title hyperlink or image thumbnail within a brief record, XTF has sufficient information to identify the target XML file and its related stylesheet, applying the stylesheet to the XML to render html.

Defining parameters to save time and ease maintenance

The scaDocFormatter.xsl stylesheet is based on nlmDocFormatter.xsl included in the default installation, with the addition of a number of parameters containing information such as the location of types of image and video files. Using parameters provides a means of maintaining in one place information that might be used in multiple locations within a stylesheet. As an example, if I wish to reference the path to the location of the powerpoint versions of the image files, it is easier to use the shorthand $powerpointImageFilePath than writing out the complete directory path each time. Later if I decide to change the location of the files, I need only alter the location information once within the definition of the parameter.

Selecting a template matching requested layout

The root template in the default nlmDocFormatter stylesheet uses a choose element to provide options for dealing with requests to see various types of layout, such as table of contents, citation or print view. Each choice corresponds to a template which formats the content to match the desired view. At this stage I intend to include templates for print, citation, full record zoom (for display of zoomified versions of the images) and video (for presenting Flash files). I’m not using the frames template and the default view will be full record.
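
The dispatch described here might look roughly like the sketch below. The doc.view parameter name matches the doc.view=... value used in this guide's URLs. The template names and the stub bodies are hypothetical and exist only so the sketch runs on its own.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Requested layout, supplied as doc.view=... in the URL -->
  <xsl:param name="doc.view" select="''"/>

  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="$doc.view = 'print'">
        <xsl:call-template name="print"/>
      </xsl:when>
      <xsl:when test="$doc.view = 'citation'">
        <xsl:call-template name="citation"/>
      </xsl:when>
      <xsl:when test="$doc.view = 'zoom'">
        <xsl:call-template name="zoom"/>
      </xsl:when>
      <xsl:when test="$doc.view = 'video'">
        <xsl:call-template name="video"/>
      </xsl:when>
      <!-- Anything else, including no doc.view at all, shows the full record -->
      <xsl:otherwise>
        <xsl:call-template name="fullRecord"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Stub templates standing in for the real layouts -->
  <xsl:template name="print"><p>print view</p></xsl:template>
  <xsl:template name="citation"><p>citation view</p></xsl:template>
  <xsl:template name="zoom"><p>zoom view</p></xsl:template>
  <xsl:template name="video"><p>video view</p></xsl:template>
  <xsl:template name="fullRecord"><p>full record view</p></xsl:template>
</xsl:stylesheet>
```

Placing the full record in xsl:otherwise is one way to make it the default view, since any unrecognised or missing doc.view value falls through to it.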

Default root template

Modified root template showing additional zoom and video options and default setting for full record.

Formatting a full record

Whenever scaDocFormatter.xsl is applied to an SCA XML record (such as when a user clicks on an image thumbnail in a brief record or its accompanying brief record title), the content is rendered using the full record template. An SCA full record comprises a number of elements such as artist, title, classification, culture, subject and copyright. For each element, the full record template tests whether the requested XML record contains the required information. In each case, if the information exists a field label is displayed along with the accompanying metadata.

Extract from the full record template illustrating the display of metadata elements dependent on their occurrence within an SCA XML record

Screenshot: Full record display

In the example, despite using a test to display a media field label only when accompanying metadata exists, the label is displaying against an apparently empty field. It could be that the source metadata record contains characters in the media field (such as whitespace) which, although not displaying, pass the test simply by existing. A more effective test may involve a simple regular expression that makes field label display dependent on the occurrence of alphabetic or numeric characters.
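
A possible form of the stricter test, sketched under the assumption that the field is a child element named media (a hypothetical name) and that the stylesheet runs as XSLT 2.0, which provides the matches() function. The label row is emitted only when the field contains at least one letter or digit.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/sca_record">
    <table>
      <!-- A plain existence test such as test="media" is true even when
           the element holds only spaces or line breaks -->

      <!-- Stricter test: show the label only if a letter or digit is present -->
      <xsl:if test="matches(string(media[1]), '[A-Za-z0-9]')">
        <tr>
          <td>Media:</td>
          <td><xsl:value-of select="normalize-space(media[1])"/></td>
        </tr>
      </xsl:if>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

If only stray whitespace needs to be ruled out, test="normalize-space(media[1]) != ''" is a simpler alternative that avoids a regular expression.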

Incorporating additional navigation options on a full record page

The above screenshot includes no option to return to the previous page. This can be remedied by including a reference to XTF’s button bar. By default the button bar provides links to print and citation displays, return to search results and home. The template can be called using the following instruction.

<xsl:call-template name="bbar"> </xsl:call-template>

Although the template is defined in a separate stylesheet, it can be called from within scaDocFormatter.xsl because that stylesheet has been imported.

<xsl:import href="../common/docFormatterCommon.xsl"/>

Screenshot including reference to the button bar template (bbar).

Screenshot: Full record with button-bar navigation options

The following image was captured after the site was styled.

By default, XTF’s button bar includes a search box (for searching within the currently displayed text). As this isn’t required for the SCA collection, it is commented out from the template (/style/dynaXML/docFormatter/common/docFormatterCommon.xsl).

Generating hyperlinks to other types of images

The screenshot above includes hyperlinks to PowerPoint, web and zoomified image displays. Each image filename comprises a record number and a file extension (e.g. sca3501-1.jpg). As described in installation and setup, each set of images is contained in a separate directory. To create the hyperlink targets, the full record template concatenates the directory path, record number and file extension (each of which is specified as a parameter, as described earlier in this section).
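
A minimal sketch of that concatenation for the PowerPoint version of an image. The parameter name powerpointImageFilePath comes from the discussion of parameters earlier in this section. The imageFileExtension parameter, the record_number element, the default value of xtfURL and the link text are hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder base URL; XTF normally supplies this value -->
  <xsl:param name="xtfURL" select="'http://localhost:8080/xtf/'"/>
  <!-- Defined once, so a change of location means a single edit -->
  <xsl:param name="powerpointImageFilePath"
             select="concat($xtfURL, 'media/sca/images/powerpoint/')"/>
  <xsl:param name="imageFileExtension" select="'.jpg'"/>

  <xsl:template match="/sca_record">
    <!-- Record number, e.g. sca3501-1; the element name is an assumption -->
    <xsl:variable name="recordNumber"
                  select="normalize-space(record_number[1])"/>
    <!-- Directory path + record number + file extension -->
    <xsl:variable name="powerpoint.href"
                  select="concat($powerpointImageFilePath,
                                 $recordNumber,
                                 $imageFileExtension)"/>
    <a href="{$powerpoint.href}">PowerPoint image</a>
  </xsl:template>
</xsl:stylesheet>
```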

The hyperlink references a variable containing information identifying the record currently being displayed, and the desired view option.

<xsl:variable name="zoom.href">
  <xsl:value-of select="$xtfURL"/>view?<xsl:value-of select="$query.string"/>;doc.view=zoom
</xsl:variable>
<a href="{$zoom.href}">zoom</a><xsl:text> a full-sized image.</xsl:text>

Extract from full record template illustrating the creation of hyperlink targets for display of powerpoint, web and zoomified images.

Using the above technique, clicking on a hyperlink for a zoomified image displays a static HTML page pre-generated using Photoshop’s Zoomify export. A more flexible approach is to use a template to generate the HTML wrapper for the zoomified image via XTF. This enables dynamic incorporation of image metadata within the page. (See zoomable image display.)

Generating hyperlinks to video

Although every record is associated with a common set of images (including thumb and zoom), only a small number describe video files. Metadata records contain information about available format types (such as digital image and video), and I use the occurrence of this information to control hyperlink display. Catering for conditional inclusion of video hyperlinks involves a slight modification of the instructions covered in the previous section.

Within the full record template, a hyperlink references a variable containing information identifying the record currently being displayed, and the desired view option.

<xsl:variable name="videoTest" select="/sca_record/video_available"></xsl:variable>

<xsl:variable name="video.href">
  <xsl:value-of select="$xtfURL"/>view?<xsl:value-of select="$query.string"/>;doc.view=video
</xsl:variable>

Extract from full record template illustrating the use of a test for conditional display of video playback hyperlinks.

Including header and footer information

In the full record screenshot above, the header is the blue section along the top of the page. The header and footer are displayed by including within the full record template references to $brand.header and $brand.footer.

<xsl:copy-of select="$brand.header"/>
<xsl:copy-of select="$brand.footer"/>

The header and footer are defined within brand/default.xml. (For more information, see styling the site.)

Zoomable image display

style/dynaXML/docFormatter/sca/scaDocFormatter.xsl

Layout for a page displaying a zoomable image is similar to that for a print view, with the addition of header and footer information. (For information about including branding and header/footer content, see styling the site).

Screenshot: Full record with zoom hyperlink

The option to zoom a full-sized image is included as a hyperlink on the full record display page (blue hyperlink just under the Full record heading).

Configuring zoom hyperlink

The hyperlink references a variable containing information identifying the record currently being displayed, and the desired view option.

<a href="{$zoom.href}">zoom</a><xsl:text> a full-sized image.</xsl:text>

Zoom template

Clicking on the link supplies this information to scaDocFormatter.xsl, which calls the specified document view template. The zoom template relates the current record with the path to the zoomified image, and holds this information in a variable called zoomifyImage. (XTF knows to use scaDocFormatter.xsl because of an association previously defined within docSelector.xsl).

The parameter zoomifyImage (defined earlier) is referenced by the section of the template responsible for configuring and displaying the flash player used to render and navigate the zoomable image.

The zoom template acts as a dynamic and flexible option for presenting zoomable images wrapped in associated metadata. Alternatively, a customisable (though static) HTML presentation for zoomable images can be generated through Photoshop.

Screenshot: Full record with zoomified image

Video playback

style/dynaXML/docFormatter/sca/scaDocFormatter.xsl

Layout for a page including a video is similar to that for a zoomable image view, with both presenting content using Flash.

Testing whether a metadata record describes a video

Although every record is associated with a common set of images (including thumb and zoom), only a small number describe video files. Metadata records contain information about available format types (such as digital image and video), and I use the occurrence of this information to control hyperlink display. If a record is accompanied by a video, a playback hyperlink displays on the full record display page (blue hyperlink just under the Full record heading).

Screenshot: Full record with video hyperlink

Screenshot: Full record without video hyperlink

Configuring video hyperlink

Within the full record template, a hyperlink references a variable containing information identifying the record currently being displayed, and the desired view option.

<xsl:variable name="videoTest" select="/sca_record/video_available"></xsl:variable>

The following instructions are similar to those previously used to generate hyperlinks to enable access to various forms of an image (such as zoom or powerpoint). The only difference in this case is the addition of a conditional statement to ensure that a link for a video playback page will only display if the metadata indicates availability of a video file.
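
One way the conditional might be written is sketched below. It treats a non-empty video_available element as meaning a video exists, which is an assumption about how the SCA records encode availability. The default values of xtfURL and query.string and the link text are placeholders.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder values; XTF normally supplies both parameters -->
  <xsl:param name="xtfURL" select="'http://localhost:8080/xtf/'"/>
  <xsl:param name="query.string" select="'docId=sca/records/sca3501-1.xml'"/>

  <xsl:template match="/">
    <xsl:variable name="videoTest" select="/sca_record/video_available"/>
    <xsl:variable name="video.href"
                  select="concat($xtfURL, 'view?', $query.string,
                                 ';doc.view=video')"/>
    <p>
      <!-- Emit the link only when the record indicates a video exists -->
      <xsl:if test="normalize-space($videoTest[1]) != ''">
        <a href="{$video.href}">play</a>
        <xsl:text> the video.</xsl:text>
      </xsl:if>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Building the href with concat() rather than with text inside xsl:variable keeps line breaks and indentation out of the generated URL.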

Video template

When the video hyperlink is clicked, scaDocFormatter.xsl calls the video document view template. The video template relates the current record with the path to the video, and holds this information in a parameter called flashVideo. (XTF knows to use scaDocFormatter.xsl because of an association previously defined within docSelector.xsl).

The parameter flashVideo (defined earlier) is referenced by the section of the template responsible for configuring and displaying the flash video player.

Screenshot: Video playback

Video playback part two: Offering video for high-, medium- and low-speed connections

style/dynaXML/docFormatter/sca/scaDocFormatter.xsl

I have Flash video suited for 1000k, 512k and 56k connections (via Flix). I use a named template with parameters to handle requests for the different video resolutions. By using parameters, I avoid the need to repeat almost identical templates, and this assists maintenance.

Configuring video hyperlinks

This is similar to the setup in the previous section, though now I have several variables.

Slight change to the test used to determine which hyperlinks will display.

Video display options within the root template

The root template now contains instructions for dealing with requests to view high-, medium- and low-resolution video files.

Video template

The video template includes parameters specifying the path to the video file, and player dimensions. This information is passed by the calling template.
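
The parameter passing can be sketched as follows. The flashVideo parameter name comes from the previous section. The doc.view values, file naming scheme, record_number element, player dimensions and the placeholder div are all hypothetical, and the Flash player markup itself is left out.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="doc.view" select="'video-medium'"/>
  <!-- Hypothetical location of the Flash video files -->
  <xsl:param name="videoFilePath" select="'media/sca/video/'"/>

  <xsl:template match="/sca_record">
    <xsl:variable name="recordNumber"
                  select="normalize-space(record_number[1])"/>
    <xsl:choose>
      <xsl:when test="$doc.view = 'video-high'">
        <xsl:call-template name="video">
          <xsl:with-param name="flashVideo"
              select="concat($videoFilePath, $recordNumber, '-1000k.flv')"/>
          <xsl:with-param name="playerWidth" select="640"/>
          <xsl:with-param name="playerHeight" select="480"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$doc.view = 'video-low'">
        <xsl:call-template name="video">
          <xsl:with-param name="flashVideo"
              select="concat($videoFilePath, $recordNumber, '-56k.flv')"/>
          <xsl:with-param name="playerWidth" select="160"/>
          <xsl:with-param name="playerHeight" select="120"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Medium is used for any other value -->
      <xsl:otherwise>
        <xsl:call-template name="video">
          <xsl:with-param name="flashVideo"
              select="concat($videoFilePath, $recordNumber, '-512k.flv')"/>
          <xsl:with-param name="playerWidth" select="320"/>
          <xsl:with-param name="playerHeight" select="240"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- One template serves every bitrate; only the parameter values differ -->
  <xsl:template name="video">
    <xsl:param name="flashVideo"/>
    <xsl:param name="playerWidth"/>
    <xsl:param name="playerHeight"/>
    <!-- Placeholder for the Flash player markup, which is not shown here -->
    <div class="player"
         style="width:{$playerWidth}px; height:{$playerHeight}px">
      <xsl:value-of select="$flashVideo"/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```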

These parameters are used within the section of the video template responsible for setting up the Flash player.

Screenshot: Full record with video hyperlinks

Displaying records formatted for printing

style/dynaXML/docFormatter/sca/scaDocFormatter.xsl

Apart from removing branding and altering the metadata display to follow below the image, the template responsible for producing a print view is very similar to a full record display. (For information about including branding and header/footer content, see styling the site).

As shown in the screenshot, the print view layout includes the URL for the full record. The URL is produced by adding the following instruction to the print template.

Sorting, browsing and facets

Following the previous steps, and provided textIndexer has been run, it is now possible to search metadata and to retrieve and display result sets rendered as brief records with accompanying thumbnail images. The thumbnails are clickable through to full records, and from the full record display other types of media are viewable (such as PowerPoint and zoomable images and Flash video files).

Although in a previous section I identified metadata for use in facets, other steps are required to produce facets. For the SCA collection, this involves working with preFilterCommon, queryParser, resultFormatterCommon, and resultFormatter.

After the changes described in the hyperlinked pages above have been made, a typical search result page displays as follows.

In the above example, an underscore appears in the facet named Special_collection. Facet titles are defined in scaPreFilter.xsl (extract follows).

Facet title display may be altered by overriding a template within resultFormatterCommon.xsl.

In this version of the template, an underscore is replaced by a single space.
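
The substitution itself needs only a single function call. The sketch below is a standalone illustration rather than the overridden XTF template. The template name facetLabel and its facetName parameter are hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Turn an internal facet name into display text -->
  <xsl:template name="facetLabel">
    <xsl:param name="facetName"/>
    <!-- translate() swaps every underscore for a space -->
    <xsl:value-of select="translate($facetName, '_', ' ')"/>
  </xsl:template>

  <!-- Standalone entry point demonstrating the call -->
  <xsl:template match="/">
    <h3>
      <xsl:call-template name="facetLabel">
        <xsl:with-param name="facetName" select="'Special_collection'"/>
      </xsl:call-template>
    </h3>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this produces an h3 element containing the text Special collection.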

Example results page showing the Special collection facet title without an underscore.

Sorting, browsing and facets: preFilterCommon

style/textIndexer/common/preFilterCommon.xsl

Enabling facet, browse and sort fields

The XTF SCA site includes sort options for title and artist (creator), browse-by options for facet, title and artist, and additional facets for classification, culture, degree, special collection, affiliation and studio. The default version of preFilterCommon.xsl contains a number of sort, facet and browse options set up for elements including title, creator and date, and these can be copied and adapted to support facets based on other elements.

Default sort, browse and facet options

SCA sort, browse and facet options

Generating sorting

Each sort option must be generated. The SCA collection sorts on title and creator (artist). The default preFilterCommon.xsl provides examples which may be copied and modified to enable sorting on other fields.

In the extract above, the parsing instructions for title and creator differ because titles and names are processed quite differently.

Generating browsing

Each browse option must be generated. The SCA collection supports browsing on title and artist (creator). Examples available in the default preFilterCommon.xsl are adaptable to suit other fields.

As with sorting, the parsing instructions are different for names and titles.

Generating facets

Each facet option must be generated. The following screen image shows facets for subject, classification, culture, degree and affiliation. Although most are specific to SCA, they are based on examples provided within the stylesheet.

Sorting, browsing and facets: queryParser

style/crossQuery/queryParser/default/queryParser.xsl

Sorting

The default queryParser sorts by title, year, reverse year, creator and publisher. The SCA collection is sortable by creator (artist) and title, so the other options are removed, as illustrated below.

Default sort attribute

Modified sort attribute

Browsing

By default, queryParser is configured to support browsing by title and creator, and this meets the needs of the SCA collection. For XTF 2.1.1, an alteration to the default stylesheet is required to enable alphabetic sorting within browse sets. queryParser correctly sorts alphabetic divisions: all of the titles starting with A are gathered within an A group, and they are all displayed before those beginning with B (which are similarly grouped within a B division). The default browse does not, however, include an instruction to ensure correct sorting within each of these divisions, so the sequence of displayed titles within an alphabetic division corresponds to the order in which the XML records were indexed. A solution (provided via the XTF users list) is illustrated below.

Default browsing

Modified browsing

Facets

queryParser includes options for configuring how facets will be grouped at the time of page display. In the following example, I specify that no more than the first five facets in each category will display, and that they be alphabetically ordered (i.e. sorted by value).
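
As a hedged illustration, facet requests of this kind might look like the fragment below, placed inside the query element that queryParser.xsl builds. The field names are hypothetical. The facet element and its field, select and sortGroupsBy attributes are given from memory of XTF's query format and should be confirmed against the XTF documentation for the installed version.

```xml
<!-- Fragment: show at most the first five groups of each facet,
     ordered alphabetically by group value -->
<facet field="facet-classification" select="*[1-5]" sortGroupsBy="value"/>
<facet field="facet-culture" select="*[1-5]" sortGroupsBy="value"/>
```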

In a previous section I describe how to identify metadata for faceted browsing by XTF and the associated steps involved in generating facets.

Sorting, browsing and facets: resultFormatterCommon

style/crossQuery/resultFormatter/common/resultFormatterCommon.xsl

Metadata parameters

Within the parameters section, I include entries for SCA metadata elements, using the same format provided for Dublin Core elements.

SCA metadata element parameters (extract)

Alpha browse parameters

Within the alpha browse parameters section, there are already entries for all, title and creator. No additional browse options are needed for the SCA collection.

Sort options template

The sort options template controls display of the drop-down sort options which may be applied to retrieved records. SCA sorts on relevance, title and artist (which are available in the default stylesheet), so I remove the other options.

Default sort options template

Modified sort options template

Alpha list template

Within the alphaList template, there are already browse-name and browse-value options for title and creator (artist). This suits the SCA collection.

Sorting, browsing and facets: resultFormatter

style/crossQuery/resultFormatter/default/resultFormatter.xsl

In an earlier section I describe changes to resultFormatter to enable display of brief records with accompanying clickable thumbnail images. The current section covers browsing by facets.

In the SCA example, very few changes are needed, as the required sort and browse options are catered for by the default version of resultFormatter. The examples are however readily adaptable for other cases. (Within a plant sciences image collection named eBot, I configured browse and sort options for title, creator, family, genus and species).

Root template

Within the root template, there are options for $browse-title and $browse-creator.

Imported from wikispaces

Browse template

Within the browse by section of the browse template, there are entries for $browse-title and $browse-creator.

Imported from wikispaces

Within the results section of the browse template, there are entries for browse-title and browse-creator.

Imported from wikispaces

Browse links template

Within the browseLinks template, there are entries for browse-title and browse-creator. The only change involves altering display text to read Artist rather than Author.

Imported from wikispaces
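
A sketch of the kind of change involved. The surrounding markup, the `crossqueryPath` default and the exact URL parameters are assumptions for illustration; the point is that only the link text changes, while the browse and sort field remains creator.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Path to the search servlet (assumed default value) -->
   <xsl:param name="crossqueryPath" select="'search'"/>

   <xsl:template name="browseLinks">
      <p>
         <xsl:text>Browse by </xsl:text>
         <a href="{$crossqueryPath}?browse-title=first;sort=title">Title</a>
         <xsl:text> | </xsl:text>
         <!-- Display text changed from Author to Artist -->
         <a href="{$crossqueryPath}?browse-creator=first;sort=creator">Artist</a>
      </p>
   </xsl:template>

</xsl:stylesheet>
```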

Results template

The results section of the results template lists the facets which will display. (Configuration and generation of facets is handled elsewhere.)

Imported from wikispaces

Next… Hierarchical facets

Hierarchical facets

Hierarchical facets provide a means for expressing nested relationships within classification schemes and taxonomies and enable people to browse a taxonomic pathway through a collection. They also provide a way to efficiently use screen-space, as illustrated in the screenshots below. This is particularly the case with the eBot plant sciences collection which is underpinned by a deep taxonomy.

Note: screenshots on this page were taken following site styling.

Expressing hierarchical relationships

Apart from the steps outlined in the previous section, XTF needs to be provided with an expression of the relationships between the various levels within the hierarchical structure. Relationships are indicated using ‘::’ notation (as explained in the XTF documentation on the subject).

Extract from an SCA XML record showing hierarchical facet notation

I followed the example provided in exercise 8 of Martin Haye’s tutorial. As part of the pre-processing of metadata (not covered in this discussion), I use Python scripting to create an additional metadata field in each record.

Imported from wikispaces
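
For readers who would rather build the field during indexing than in a pre-processing script, the same ‘::’ path could be assembled in XSLT. This is a sketch of an alternative technique, not the Python approach I used. The input structure (`classification` containing `level` elements), the output field name `facet-classification` and the `xtf:facet` attribute are all assumptions to check against your own preFilter.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xtf="http://cdlib.org/xtf">

   <!-- Hypothetical input:
        <classification>
           <level>Ceramics</level>
           <level>Vessels</level>
        </classification> -->
   <xsl:template match="classification">
      <facet-classification xtf:meta="true" xtf:facet="yes">
         <!-- Join the levels, top-most first, with the '::' separator -->
         <xsl:value-of select="string-join(level/normalize-space(.), '::')"/>
      </facet-classification>
   </xsl:template>

</xsl:stylesheet>
```

With the hypothetical input shown in the comment, the field value would be `Ceramics::Vessels`, which XTF can then treat as a two-level path.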

Screenshots: SCA archive, top-, first-, second- and third-level facet navigation

Top level classification facet.

Imported from wikispaces

First level

Imported from wikispaces

Second level

Imported from wikispaces

Third level

Imported from wikispaces

Screenshot: Fish-bone image collection, top-level taxonomy facet

Imported from wikispaces

Screenshot: eBot plant sciences collection, expanded taxonomy pathway

Example illustrating partial navigation of the higher level taxonomy of the eBot plant sciences collection.

Imported from wikispaces

Next… Queries and search forms

Queries and search forms

Defining a simple keyword search

style/crossQuery/queryParser/default/queryParser.xsl

XTF needs to know which fields to target for a simple keyword search. Field names are specified within a parameter.

Imported from wikispaces
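
A sketch of such a parameter, assuming it is named `fieldList` as I recall from the default queryParser.xsl. The field names after `text` are illustrative; they would need to match the metadata element names defined in the indexing preFilter.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Space-separated list of fields targeted by a simple keyword search.
        'text' is the full text; the remaining names are illustrative. -->
   <xsl:param name="fieldList"
      select="'text title creator subject description '"/>

</xsl:stylesheet>
```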

Defining search forms

style/crossQuery/resultFormatter/default/searchForms.xsl

XTF’s default user interface includes keyword, advanced, freeform and browse options. As Jacqui and I have not yet designed an advanced search, I remove it from display, along with the freeform option.

Search options for advanced and freeform are commented out.

Imported from wikispaces
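
A sketch of the approach. The template name, table markup and URLs are assumptions rather than a copy of searchForms.xsl; the technique is simply to wrap the unwanted options in XML comments so they can be restored later.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Sketch only: template name, markup and URLs are assumptions -->
   <xsl:template name="searchTabs">
      <table>
         <tr>
            <td><a href="search?smode=simple">Keyword</a></td>
            <!-- Advanced search hidden until a form has been designed
            <td><a href="search?smode=advanced">Advanced</a></td>
            -->
            <!-- Freeform search hidden
            <td><a href="search?smode=freeform">Freeform</a></td>
            -->
            <td><a href="search?browse-all=yes">Browse</a></td>
         </tr>
      </table>
   </xsl:template>

</xsl:stylesheet>
```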

Altering example search text to reflect SCA subject matter

Imported from wikispaces

Screenshot: Modified search page

Imported from wikispaces

Next… Advanced search

Advanced search

style/crossQuery/resultFormatter/default/searchForms.xsl

As mentioned in the previous section, Jacqui and I have not yet designed an advanced search for the SCA collection, so this section instead describes the advanced search page for the eBot plant sciences collection.

Imported from wikispaces

Input fields

Apart from removing some of the default search options from display, I set up input fields to enable targeted search across particular metadata elements. The example below illustrates description, family, genus, species and common-name fields. In each case value="{$abc}" refers to the name of the target metadata element. Metadata element names are defined in an indexing preFilter. (For the SCA collection, scaPrefilter.xsl is described in an earlier section.)

Imported from wikispaces
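
A sketch of two such fields, assuming the indexed elements are named `family` and `genus`. The template name and table markup are hypothetical. Because each input's value is taken from the parameter of the same name, the text a user entered reappears when they return to the form.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- One parameter per searchable metadata element (assumed names) -->
   <xsl:param name="family"/>
   <xsl:param name="genus"/>

   <xsl:template name="taxonomyInputs">
      <table>
         <tr>
            <td>Family</td>
            <!-- name is the field searched; value echoes the last entry -->
            <td><input type="text" name="family" size="30" value="{$family}"/></td>
         </tr>
         <tr>
            <td>Genus</td>
            <td><input type="text" name="genus" size="30" value="{$genus}"/></td>
         </tr>
      </table>
   </xsl:template>

</xsl:stylesheet>
```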

Dropdown lists

The eBot advanced search form uses several dropdown lists. The following example is an extract from image type. eBot contains a number of types including herbarium, plant and micrograph.

Imported from wikispaces

Making dropdown selections sticky

By default, dropdown list selections are not sticky. If I choose an option from a dropdown list, perform a search and then click ‘modify search’ on the results page to return to the advanced search page, my previously selected choice does not display. Instead, the default option at the head of the dropdown list displays. Selections can be made sticky by including an instruction provided elsewhere by XTF.

Imported from wikispaces
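
One way to get this behaviour is to compare each option with the value already present in the URL and mark the matching option as selected. The sketch below is an illustration of that idea, not necessarily the instruction XTF itself uses; the parameter name `type` and the template name are hypothetical.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Previously chosen image type, if any (assumed parameter name) -->
   <xsl:param name="type"/>

   <xsl:template name="imageTypeList">
      <select size="1" name="type">
         <option value="">all types</option>
         <xsl:for-each select="('herbarium', 'plant', 'micrograph')">
            <option value="{.}">
               <!-- Re-select the option the user chose last time -->
               <xsl:if test=". = $type">
                  <xsl:attribute name="selected" select="'selected'"/>
               </xsl:if>
               <xsl:value-of select="."/>
            </option>
         </xsl:for-each>
      </select>
   </xsl:template>

</xsl:stylesheet>
```

When `$type` is empty no option is marked, so the browser falls back to the first entry in the list.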

Future enhancements

The current advanced search is quite simple and could be developed to be far more extensive and flexible. I’d like to provide dropdown lists or automatic completion for fields such as family, genus and species. These classification elements are already related to each other through an extensive taxonomy underpinning the eBot collection, and the relationships might be exploited to provide additional support for searching. As an example, if a user wishes to search on a particular species, selection from the species category should trigger auto-population of the family and genus fields. It would also be useful to enable the boolean ‘OR’ within categories such as image type. This would enable multiple selections from the list, to suit cases where people are interested in results comprising several image types such as herbarium OR micrograph.

Next… Styling the site

Styling the site

brand/default.xml

Modified header and footer

Header and footer information is contained in brand/default.xml which I alter to provide pointers to styling elements more reflective of the University of Sydney corporate style.

Imported from wikispaces

Changes to css

As css changes are specific to the University of Sydney context, I haven’t included details.

Referencing header, footer and css within templates

Because css, header and footer information is contained within brand/default.xml, it is sufficient to provide references within templates, in the following format.

<xsl:copy-of select="$brand.links"/>
<xsl:copy-of select="$brand.header"/>

Results template including references to $brand.links (for css) and $brand.header:

Imported from wikispaces
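
A sketch showing where the references might sit within a results template. The page structure and the `$brand.footer` parameter are assumptions; in a real XTF stylesheet the brand parameters are populated from brand/default.xml rather than declared empty as they are here.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Declared empty so the sketch stands alone -->
   <xsl:param name="brand.links"/>
   <xsl:param name="brand.header"/>
   <xsl:param name="brand.footer"/>

   <xsl:template match="crossQueryResult" mode="results">
      <html>
         <head>
            <title>Search results</title>
            <!-- css links from the brand file -->
            <xsl:copy-of select="$brand.links"/>
         </head>
         <body>
            <xsl:copy-of select="$brand.header"/>
            <!-- result listing and facets go here -->
            <xsl:copy-of select="$brand.footer"/>
         </body>
      </html>
   </xsl:template>

</xsl:stylesheet>
```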

Screenshot: Styled search page

Imported from wikispaces

Screenshot: Styled browse-all by facet page

Imported from wikispaces

Screenshot: Styled full-record page

Imported from wikispaces

Next… Displaying error messages

Displaying error messages

Presentation of error messages is handled by errorGen.xsl. There are two files, one each for index errors and query errors. They are located at

style/dynaXML/errorGen.xsl
style/crossQuery/errorGen.xsl

Screenshot: Default error display

Imported from wikispaces

Screenshot: Customised presentation of error messages

This example shows an error message within a navigation model used throughout the site.

Imported from wikispaces

Referencing site styling elements

Apart from altering the order of displayed sections and editing contact details, I add references to styling elements.

Imported from wikispaces

Modifying the error presentation template

Styling elements are referenced within a modified template.

Imported from wikispaces

Next… About the collections

About the collections

The University of Sydney Library is using an XTF presentation model to enable online access to several media collections in the fields of plant sciences, visual arts and archaeology. eBot was released for public access in October 2009, and the Archaeology collection followed in November. The SCA collection was launched in February 2010. This initiative results from partnerships between the Library, academics and associated project managers.

  • SCA Archive
    The SCA Archive contains image and video reproduction of works created by visual arts academics and students at Sydney College of the Arts. The key project partner is Jacqui Spedding, project manager for SCA Images Online. The SCA Archive comprises 1509 items.
    Screenshots | Access the SCA Archive

  • eBot
    eBot is a collection of plant sciences images representing the work of University of Sydney academics for use in research, learning and teaching programs. Key project partners include Murray Henwood and Rosanne Quinnell (School of Biological Sciences) and Su Hanfling (Library). More information about eBot, from the proceedings of the 2008 Uniserve Conference. eBot comprises 1856 images.

    Screenshots | Access eBot

  • Sarah Colley’s fish-bone image collection
    Sarah Colley’s archaeological collection contains images of fish-bones and provides support for her research and that of her project partners. One of Sarah’s research interests concerns the archaeological study of fish and fishing in Sydney before and after 1788. The collection comprises 812 images.

    Screenshots | Access Sarah Colley’s fish-bone image collection

Screenshots and Details for Each Collection

SCA Archive

The SCA Archive contains image and video reproduction of works created by visual arts academics and students at Sydney College of the Arts. The key project partner is Jacqui Spedding, project manager for SCA Images Online. The SCA Archive comprises 1509 items.

Screenshots

Screenshots from the SCA Archive

Collection home page

A static html page providing an entry point to the collection. Useful for promotion and marketing.

Imported from wikispaces

About the collection

A static html page with background information.

Imported from wikispaces

Search page

Imported from wikispaces

Browse page

Imported from wikispaces

Browse all by facet

Imported from wikispaces

Full record with image display options

These display options are visible if the record describes an image. (Options appear below the red Full record heading.)

Imported from wikispaces

Full record with zoomable image

Imported from wikispaces

Full record with image and video display options

These display options are visible if the record describes a video. (Options appear below the red Full record heading.)

Imported from wikispaces

Full record with video playback

Imported from wikispaces

Full record print view

Imported from wikispaces

eBot Archive

eBot is a collection of plant sciences images representing the work of University of Sydney academics for use in research, learning and teaching programs. Key project partners include Murray Henwood and Rosanne Quinnell (School of Biological Sciences) and Su Hanfling (Library). More information about eBot, from the proceedings of the 2008 Uniserve Conference. eBot comprises 1856 images.

Screenshots from the eBot archive

Search page

Imported from wikispaces

Browse page

Imported from wikispaces

Browse all by facet

Imported from wikispaces

Full record

Imported from wikispaces

Imported from wikispaces

Full record with zoomable image

Imported from wikispaces

Full record print view

Imported from wikispaces

Sarah Colley’s fish-bone image collection

Sarah Colley’s archaeological collection contains images of fish-bones and provides support for her research and that of her project partners. One of Sarah’s research interests concerns the archaeological study of fish and fishing in Sydney before and after 1788. The collection comprises 812 images.

Screenshots from Sarah Colley’s fish-bone collection

Sarah Colley’s archaeological collection contains images of fish-bones and provides support for her research and that of her project partners. One of Sarah’s research interests concerns the archaeological study of fish and fishing in Sydney before and after 1788. At this time the collection is available for use by Sarah’s project partners, though in future it will become openly accessible. The collection comprises 487 images.

Collection home page

A static html page providing an entry point to the collection. Useful for promotion and marketing.

Imported from wikispaces

About the collection

A static html page with background information.

Imported from wikispaces

Search page

Imported from wikispaces

Browse page

Imported from wikispaces

Browse all by facet

Imported from wikispaces

Full record

Imported from wikispaces

Full record with zoomable image

Imported from wikispaces

Full record print view

Imported from wikispaces

Next steps and future developments

Harvesting and robots

This isn’t something I’ve investigated at all, though I understand that XTF provides support for robots and OAI harvesting. I’d like to enable harvesting and discovery by Google, while ensuring that media is viewed in the context of the XTF presentation.

Advanced search

In the advanced search section I mention a couple of eBot future enhancements which are applicable to presentation of the other collections. The current eBot advanced search is quite simple and could be developed to be far more extensive and flexible. I’d like to provide dropdown lists or automatic completion for fields that employ controlled vocabularies (such as family, genus and species). These classification elements are already related to each other through an extensive taxonomy underpinning the eBot collection, and the relationships might be exploited to provide additional support for searching. As an example, if a user wishes to search on a particular species, selection from the species category should trigger auto-population of the family and genus fields. It would also be useful to enable the boolean ‘OR’ within categories such as image type. This would enable multiple selections from the list, to suit cases where people are interested in results comprising several image types such as herbarium OR micrograph.

Content management and relationships between content management & presentation

Following on from the advanced search comments above, I’d like to draw in content for dropdown lists and taxonomies from other sources. For each collection, metadata and taxonomies are maintained elsewhere by the content creators. At this stage I have a semi-automated approach to metadata and media import (for indexing and presentation), but I’m not yet importing controlled vocabularies for presentation on the search pages. For these three target collections, I’d like to automate as much as possible the process of content transfer.

Facet paging

On a brief record display screen, it’s useful to be able to choose to display the first ‘X’ number of facets within a category (eg top 5). Having scanned the first 5 or 10, I’d like to be able to ‘page’ through successive sets, as an option to viewing all of the remaining facets in one hit. This may be useful in cases where there is an extensive facet list and where nesting of facets is not an option.

PDFs and powerpoints

It may be useful to enable generation of pdfs and powerpoints combining an image with its associated metadata.

Packaging and downloading images and related metadata

As an additional option to citation emailing (i.e. bookbag / my citations), it may be useful to enable downloading of packages of selected media and associated metadata.

Other…

Other activities will include investigation of alternate media presentation models and learning more about XML, XSL, text processing techniques and ways of ensuring common display across multiple browsers (eg via the Yahoo User Interface Library).

About the author

My name is Rowan Brownlee and I work in the eScholarship Division of the University of Sydney Library. As Digital Project Analyst I participate in activities supporting research, learning and teaching. Led by Ross Coleman, eScholarship is a relatively new initiative which includes a digital archive and Sydney University Press. The Library also has considerable experience in text encoding through the work of Creagh Cole within SETIS (Sydney Electronic Text and Imaging Service).

I have been exploring the use of XTF for searching and presenting media collections in the subject areas of plant sciences, archaeology and visual arts. Although I am an XSLT beginner, the outcomes have been encouraging. I am interested in helping introduce others to XTF, so I have described my application of XTF to the SCA Archive, a collection of image and video reproductions of works created by staff and students at Sydney College of the Arts. My project partner is Jacqui Spedding, a ceramic artist and project manager for SCA Images Online.