Greenstone tutorial exercise

Prerequisite: A collection of Word and PDF files
Devised for Greenstone version: 2.85|3.06
Modified for Greenstone version: 2.87|3.11

Associated files: combining different versions of the same document together

This tutorial demonstrates how to link different versions of the same document together in Greenstone. As an example, two identical articles about Greenstone are used; one is in PDF format, the other in Word.

  1. Start a new collection called Associated Files Example, by selecting File → New. Enter an appropriate description for your collection.

  1. Copy the files pdf01.pdf and word03.doc provided in sample_files → Word_and_PDF → Documents into your new collection. Do this by dragging these files across from the filesystem view on the left of the Gather panel into the Collection view on the right.

  1. In the collection view, right-click on each file and select Rename, renaming them greenstone1.pdf and greenstone1.doc, respectively.

  1. In the Enrich panel, assign appropriate dc.Title and dc.Creator metadata to the documents. Since the contents are identical, you can select both documents and set metadata for them simultaneously.

Associating one document with another

  1. In the Document Plugins section of the Design panel, select the WordPlugin and press the <Configure Plugin...> button. In the resulting popup, scroll down to the associate_ext option and set it to pdf. Now, for each Word document, Greenstone will look for a file with exactly the same name but a .pdf extension. These PDFs will not be processed separately; instead, they will be associated with their equivalent Word documents. (Alternatively, you could make the PDF the primary document by setting the associate_ext option of the PDFPlugin to doc.)

  1. Build the collection. Notice that only one document was considered for processing and included in the collection. Since the PDF version of the document is an associated document, it is not processed.
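
The associate_ext setting made in the steps above is stored with the collection's configuration. As a rough sketch only, assuming a Greenstone 3 collection whose plugin settings are written to collectionConfig.xml, the resulting entry might look something like the fragment below. The element and attribute layout is an assumption and may differ between versions, so treat the GLI dialog as the authoritative way to set the option.

```xml
<!-- Sketch only: assumed shape of the WordPlugin entry in a Greenstone 3
     collectionConfig.xml after associate_ext has been set in the GLI. -->
<plugin name="WordPlugin">
  <!-- Attach same-named .pdf files to the Word document
       instead of processing them as separate documents. -->
  <option name="-associate_ext" value="pdf"/>
</plugin>
```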

Linking to associated documents

  1. Greenstone has internally associated the PDF version with the Word version of the document. However, with the default format statement, the end-user will have no idea that the PDF version exists. The collection built at this point (with default settings) only gives the user the choice of viewing either the Word version or the Greenstone-generated HTML version of the document. They are not given the option to view the PDF version.

    To let users view the PDF version of the document, edit the documentNode template of the Browse Format Feature in the Format panel so that it displays the equivDocIcon wrapped in a link to the PDF document (equivDocLink), as follows.

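    Change the existing template (shown first) to the modified version (shown second), which adds a table cell for the equivalent document: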

    <gsf:template match="documentNode">
    <td valign="top">
    <gsf:link type="document">
    <gsf:icon type="document"/>
    </gsf:link>
    </td>
    <td valign="top">
    <gsf:link type="source">
    <gsf:choose-metadata>
    <gsf:metadata name="thumbicon"/>
    <gsf:metadata name="srcicon"/>
    </gsf:choose-metadata>
    </gsf:link>
    </td>
    <td valign="top">
    <gsf:link type="document">
    <!--
    Defined in the global format statement
    -->
    <xsl:call-template name="choose-title"/>
    <gsf:switch>
    <gsf:metadata name="Source"/>
    <gsf:when test="exists">
    <br/>
    <i>(<gsf:metadata name="Source"/>)</i>
    </gsf:when>
    </gsf:switch>
    </gsf:link>
    </td>
    </gsf:template>

    <gsf:template match="documentNode">
    <td valign="top">
    <gsf:link type="document">
    <gsf:icon type="document"/>
    </gsf:link>
    </td>
    <td valign="top">
    <gsf:link type="source">
    <gsf:choose-metadata>
    <gsf:metadata name="thumbicon"/>
    <gsf:metadata name="srcicon"/>
    </gsf:choose-metadata>
    </gsf:link>
    </td>
    <td valign="top">
    <gsf:metadata name="equivDocLink"/>
    <gsf:metadata name="equivDocIcon"/>
    <gsf:metadata name="/equivDocLink"/>
    </td>
    <td valign="top">
    <gsf:link type="document">
    <!--
    Defined in the global format statement
    -->
    <xsl:call-template name="choose-title"/>
    <gsf:switch>
    <gsf:metadata name="Source"/>
    <gsf:when test="exists">
    <br/>
    <i>(<gsf:metadata name="Source"/>)</i>
    </gsf:when>
    </gsf:switch>
    </gsf:link>
    </td>
    </gsf:template>


    This change to the browse format statement adds the equivalent document icon (a PDF icon in this case) next to the source icon (the Word icon) in general classifiers. Preview the collection and browse it either by Titles or by Filenames.

    Note: When Greenstone encounters a file that matches the provided associate_ext value (pdf in our case), it sets the metadata value ex.equivDocIcon for that document to the macro _iconXXX_, where XXX is the filename extension (so _iconpdf_ in our case). As long as a macro is defined whose name is the word "icon" followed by that filename extension, a suitable icon will be displayed when the document appears in a VList. For pdf, the displayed icon is a PDF icon.

  1. In the Format panel, go to Format Features → search, where you will see:

    <gsf:template match="documentNode">
    <td valign="top">
    <gsf:link type="document">
    <gsf:icon type="document"/>
    </gsf:link>
    </td>
    <td>
    <gsf:link type="document">
    <xsl:call-template name="choose-title"/>
    </gsf:link>
    </td>
    </gsf:template>

    With this template, each search result shows only a link to the Greenstone-generated HTML version of the original source document, followed by the title of the document.

    Change the above to:

    <gsf:template match="documentNode">
    <td valign="top">
    <gsf:link type="document">
    <gsf:icon type="document"/>
    </gsf:link>
    </td>

    <td valign="top">
    <gsf:link type="source">
    <gsf:choose-metadata>
    <gsf:metadata name="thumbicon"/>
    <gsf:metadata name="srcicon"/>
    </gsf:choose-metadata>
    </gsf:link>
    </td>
    <td valign="top">
    <gsf:metadata name="equivDocLink"/>
    <gsf:metadata name="equivDocIcon"/>
    <gsf:metadata name="/equivDocLink"/>
    </td>

    <td>
    <gsf:link type="document">
    <xsl:call-template name="choose-title"/>
    </gsf:link>
    </td>
    </gsf:template>

    Now, in each search result, the link to Greenstone's HTML version of the document is followed by a link to the source document (the doc file) and a link to its equivalent document (the PDF file in our example).
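
In a collection where only some documents have an associated file, it may be useful to show the equivalent-document link only when one exists. The fragment below is a minimal sketch of that idea. It reuses the gsf:switch pattern that the browse template already applies to Source metadata, and assumes that equivDocIcon is set only for documents that have an associated file. It would take the place of the three equivDocLink and equivDocIcon lines inside the table cell.

```xml
<!-- Sketch only: show the equivalent-document link only when the
     document has an associated file (assumes equivDocIcon is
     otherwise unset). -->
<td valign="top">
  <gsf:switch>
    <gsf:metadata name="equivDocIcon"/>
    <gsf:when test="exists">
      <!-- equivDocLink opens the link and /equivDocLink closes it -->
      <gsf:metadata name="equivDocLink"/>
      <gsf:metadata name="equivDocIcon"/>
      <gsf:metadata name="/equivDocLink"/>
    </gsf:when>
  </gsf:switch>
</td>
```

The table cell itself stays outside the switch, so rows keep the same number of columns whether or not a document has an equivalent file.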


Copyright © 2005-2019 by the New Zealand Digital Library Project at the University of Waikato, New Zealand
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”