Greenstone tutorial exercise
Building a small collection of HTML files
You will need some HTML files, such as those in the simple_html folder in sample_files.
Running the Greenstone Librarian Interface
- Start the Greenstone Librarian Interface:
Start → All Programs → Greenstone-2.87 → Librarian Interface (GLI)
If you are using Windows Vista or Windows 7 and have installed Greenstone into C:\Program Files\Greenstone, a User Account Control dialog may appear as you try to start the Greenstone Librarian Interface. Click <Yes> to continue. After a short pause a startup screen appears, and then after a slightly longer pause the main Greenstone Librarian Interface appears. (A command prompt is also opened in the background.)
Starting a new collection
- Start a new collection within the Librarian Interface:
File → New...
- You will create a collection based on a few HTML web pages from the Tudor collection. A window pops up. Fill it out with appropriate values, for example:
Collection title: Small HTML Collection
Description of content: A small collection of HTML pages.
Leave the setting for Base this collection on: at its default, -- New Collection --, and click <OK>.
- Next you must gather together the files that will constitute the collection. A suitable set has been prepared ahead of time in sample_files → simple_html → html_files. Using the left-hand side of the Librarian Interface's Gather panel, interactively navigate to the sample_files → simple_html folder.
Adding documents to the collection
- Now drag the html_files folder from the left-hand side and drop it on the right. The progress bar at the bottom shows some activity. Gradually, copies of all the files will appear in the collection panel. A popup may appear saying that geov2.js is an unrecognised filetype and can't be processed by GLI. Tick the checkbox so that this message is not shown again.
You can inspect the files that have been copied by double-clicking on the folder in the right-hand side.
- Since this is our first collection, we won't complicate matters by manually assigning metadata or altering the collection's design. Instead we rely on default behaviour. So pass directly to the Create panel by clicking its tab.
Building the collection
- To start building the collection, click the <Build Collection> button.
- Once the collection has built successfully, a window pops up to confirm this. Click <OK>.
- Click the <Preview Collection> button to look at the end result. This loads the relevant page into your web browser (starting it up if necessary).
Viewing the extracted metadata
- Back in the Librarian Interface, click the Enrich tab to view the metadata associated with the documents in the collection.
- At present there is no manually assigned metadata, but the act of building the collection has extracted metadata from the documents. Double-click the html_files folder to expand its content. Then single-click aragon.html to display all its metadata in the right-hand side of the panel. The initial fields, starting "dc.", are empty. These are Dublin Core metadata fields for manually entered data.
- Use the scroll bar on the extreme right to view the bottom part of the list. There you will see fields starting "ex." that express the extracted metadata: for example ex.Title, based on the text within the HTML Title tags, and ex.Language, the document's language (represented using the ISO standard 2-letter mnemonic) which Greenstone determines by analyzing the document's text.
- Close the collection by clicking File → Close. This automatically saves the collection to disk.
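To make the idea of extracted metadata more concrete, the following is a minimal Python sketch of how a value like ex.Title could be derived from the text within a page's HTML Title tags. It is an illustration only, not Greenstone's own extraction code. The names TitleExtractor and extract_title and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text found between <title> and </title>.

    Hypothetical helper for illustration; not part of Greenstone.
    """

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased, so <TITLE> is matched as well.
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace and trim both ends.
        return " ".join("".join(self._parts).split())


def extract_title(html_text):
    """Return the page title, or an empty string if there is none."""
    parser = TitleExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.title


# Made-up sample page; the real aragon.html will differ.
sample = (
    "<html><head><title>Katharine of Aragon</title></head>"
    "<body><p>...</p></body></html>"
)
print({"ex.Title": extract_title(sample)})
```

A page without a Title tag yields an empty string in this sketch. How Greenstone itself handles that case is not covered here.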
Viewing the internal links and external links
- Hyperlinks in a Greenstone collection work like this: if the link is to a document that is also in the collection, clicking it takes you to that document in the collection. If the link is to a document that is not in the collection, clicking it takes you to that document on the web.
- Go back to the web browser and click the titles link near the top of the page. Open the file boleyn.html and look for the link to Katharine of Aragon (in the 5th paragraph of the Biography section). This links to a document inside the collection, aragon.html. View this document by clicking the link.
- For an external link, return to boleyn.html and click letters written by Anne (in the Primary Sources section). This takes you out on to the web.
- If you want a warning message to be displayed first, open the Greenstone → etc → main.cfg file and uncomment the line cgiarg shortname=el argdefault=prompt (remove the # at the start of the line to uncomment it). Note that if you are already browsing a collection, you will need to go back to the home page and re-enter the collection, or even clear your browser history, for this to take effect (because the el argument is cached). Alternatively, try restarting the Greenstone web server.
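The internal/external rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Greenstone's actual link-rewriting logic. The function classify_link, the folder layout in collection_docs, and the example.org URL are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def classify_link(href, source_doc, collection_docs):
    """Classify a hyperlink found in source_doc.

    Returns ("internal", path) when the target is a document in the
    collection, otherwise ("external", href).
    Assumption: documents are identified by their relative file paths.
    """
    parsed = urlparse(href)

    # Anything with a scheme or host (http://..., mailto:...) is treated
    # as pointing outside the collection.
    if parsed.scheme or parsed.netloc:
        return ("external", href)

    # A fragment-only link such as "#notes" stays in the same document.
    if not parsed.path:
        return ("internal", source_doc)

    # Resolve a relative link against the folder of the source document.
    base_dir = posixpath.dirname(source_doc)
    target = posixpath.normpath(posixpath.join(base_dir, parsed.path))

    if target in collection_docs:
        return ("internal", target)
    return ("external", href)


# Hypothetical collection layout, for illustration only.
collection_docs = {"html_files/aragon.html", "html_files/boleyn.html"}

print(classify_link("aragon.html", "html_files/boleyn.html", collection_docs))
print(classify_link("http://example.org/letters.html",
                    "html_files/boleyn.html", collection_docs))
```

In this sketch the first call is classified as internal, because aragon.html is part of the collection, and the second as external, which corresponds to being taken out on to the web.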
Setting up a shortcut in the Librarian Interface
- To set up a shortcut to the source files, in the Gather panel navigate to the folder in your local file space that contains the files you want to use—in our case, the sample_files folder. Select this folder and then right-click it, and choose Create Shortcut from the menu. In the Name field, enter the name you want the shortcut to have, or accept the default sample_files. Click <OK>. Close all the folders in the file tree in the left-hand pane, and you will see the shortcut to your source files.