Greenstone tutorial exercise

Prerequisite: A collection of Word and PDF files
Devised for Greenstone version: 2.85|3.06
Modified for Greenstone version: 2.87|3.08

Associated files: combining different versions of the same document together

This tutorial demonstrates how to link different versions of the same document together in Greenstone. As an example, two identical articles about Greenstone are used; one is in PDF format, the other in Word.

  1. Start a new collection called Associated Files Example, by selecting File → New. Enter an appropriate description for your collection.

  1. Copy the files pdf01.pdf and word03.doc provided in sample_files → Word_and_PDF → Documents into your new collection. Do this by dragging these files across from the filesystem view on the left of the Gather panel into the Collection view on the right.

  1. In the collection view, right-click on each file and select Rename, renaming them greenstone1.pdf and greenstone1.doc, respectively.

  1. In the Enrich panel, assign appropriate dc.Title and dc.Creator metadata to the documents. Since the contents are identical, you can select both documents and set metadata for them simultaneously.

Associating one document with another

  1. In Document Plugins, select the WordPlugin and press the <Configure Plugin...> button. In the resulting popup, scroll down to find the associate_ext option, and set this option to pdf. Now, for each Word document, Greenstone will look for a file with exactly the same root filename but the pdf extension. These PDFs will not be processed separately; instead, they will be associated with their equivalent Word documents. (Alternatively, you could make the PDF document the primary document by setting the associate_ext option in the PDFPlugin to doc.)

  1. Build the collection. Notice that only one document was considered for processing and included in the collection. Since the PDF version of the document is an associated document, it is not processed.
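The pairing behaviour described above can be sketched in a few lines of Python. This is an illustration of the matching rule only (same root filename, extension listed in associate_ext), not Greenstone's actual implementation; the function name split_associated and its parameters are hypothetical, and the sketch assumes all files sit in one folder.

```python
from pathlib import PurePath


def split_associated(filenames, primary_ext, associate_exts):
    """Partition filenames into primary documents and their associated files.

    Illustrative only: a file counts as associated when it shares its root
    name with a primary document and its extension is in associate_exts.
    """
    primary_ext = primary_ext.lower()
    associate_exts = {ext.lower() for ext in associate_exts}

    # Map each primary document's root name (filename minus extension)
    # to its full filename.
    primaries = {}
    for name in filenames:
        path = PurePath(name)
        if path.suffix.lower().lstrip(".") == primary_ext:
            primaries[path.stem] = name

    associated = {name: [] for name in primaries.values()}
    standalone = []
    for name in filenames:
        path = PurePath(name)
        ext = path.suffix.lower().lstrip(".")
        if ext == primary_ext and path.stem in primaries:
            continue  # already recorded as a primary document
        if ext in associate_exts and path.stem in primaries:
            # Same root name as a primary: attach it instead of
            # treating it as a document in its own right.
            associated[primaries[path.stem]].append(name)
        else:
            standalone.append(name)
    return associated, standalone


associated, standalone = split_associated(
    ["greenstone1.doc", "greenstone1.pdf", "other.pdf"],
    primary_ext="doc",
    associate_exts=["pdf"],
)
print(associated)
print(standalone)
```

In this sketch greenstone1.pdf is attached to greenstone1.doc, which mirrors why only one document is processed during the build. The hypothetical other.pdf has no matching Word file, so it is left to be handled on its own.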

Linking to associated documents

  1. Greenstone has internally associated the PDF version with the Word version of the document. However, with the default format statement, the end-user will have no idea that the PDF version exists. The collection built at this point (with default settings) only gives the user the choice of viewing either the Word version or the Greenstone-generated HTML version of the document. They are not given the option to view the PDF version.

    To allow users to view the PDF version of the document, change the default VList statement from this:

    <td valign="top">[ex.srclink]{Or}{[ex.thumbicon],[ex.srcicon]}[ex./srclink]</td>

    to this:

    <td valign="top">[ex.equivDocLink][ex.equivDocIcon][ex./equivDocLink]</td>

    Two things occur in this replacement. The main change swaps ex.srclink and ex.srcicon, which provide the link to the primary source document (the Word document), for a hyperlinked icon pointing to the document that Greenstone has associated as an equivalent (the PDF version). Greenstone chooses the icon based on the filename extension of the matching file it has found; in this case, the PDF icon.

    The second, more minor, change simplifies the statement. The original uses an {Or} statement to show a thumbnail of the document, if Greenstone has one, in preference to the source icon. Since no thumbnails are generated in this collection, the {Or} combination is dropped and the statement goes straight to the ex.equivDocIcon metadata item.

    To make the change, switch to the Format panel and edit the format statement for VList (All). Before the edit, the relevant lines read:

    <td valign="top">[link][icon][/link]</td>
    <td valign="top">[ex.srclink]{Or}{[ex.thumbicon],[ex.srcicon]}[ex./srclink]</td>
    <td valign="top">[highlight]

    After the edit, they read:

    <td valign="top">[link][icon][/link]</td>
    <td valign="top">[ex.equivDocLink][ex.equivDocIcon][ex./equivDocLink]</td>
    <td valign="top">[highlight]
    [/highlight]{If}{[dc.Creator],: [sibling(All'\, '):dc.Creator]}</td>

    Preview the collection.

    Note: When Greenstone encounters a file that matches the provided associate_ext value (pdf in our case), it sets the metadata value ex.equivDocIcon for that document to the macro _iconXXX_, where XXX is the filename extension (so _iconpdf_ in our case). As long as a macro is defined for that combination of the word icon and the filename extension, a suitable icon is displayed when the document appears in a VList. For pdf, this is the PDF document icon.
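The naming rule in the note can be illustrated with a small Python sketch. It only mirrors the _iconXXX_ pattern stated above and is not taken from Greenstone's source; the function name equiv_doc_icon_macro is hypothetical.

```python
def equiv_doc_icon_macro(filename):
    """Return the icon macro name for an associated file, e.g. _iconpdf_."""
    # Split on the last dot: everything after it is the extension.
    _, dot, ext = filename.rpartition(".")
    if not dot or not ext:
        raise ValueError(f"no filename extension in {filename!r}")
    # The macro name is the word "icon" plus the extension,
    # wrapped in underscores.
    return f"_icon{ext}_"


print(equiv_doc_icon_macro("greenstone1.pdf"))
```

Whether an icon actually appears still depends on a macro of that name being defined, as the note explains; the sketch only shows how the name is formed.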

Copyright © 2005-2016 by the New Zealand Digital Library Project at the University of Waikato, New Zealand
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”