Greenstone tutorial exercise
Setting up your Greenstone OAI Server
Greenstone 3 collections are available over OAI by default. Their collectionConfig.xml files already specify that each collection is OAI enabled, through use of an OAIPMH element. If you want to stop a collection from being accessible over OAI, edit the OAIPMH element in that collection's collectionConfig.xml. This tutorial looks at how to make an existing collection available over OAI and how to test its accessibility by having it validated against the Open Archives validator.
- Start up the Greenstone 3 Server by going to Windows Start → All Programs → Greenstone-3 → Greenstone3 Server. Press the Enter Library button and you will end up on your Digital Library home page as usual. Adjust the URL so that instead of the library suffix, it says oaiserver. The page that loads now will contain an error message (badVerb) saying that you've provided an illegal OAI verb. This is because the OAI specification requires you to provide more instruction in the URL as to what you want. The specification defines verbs and possible arguments to them. A basic verb is Identify, which requests the OAI server to return some information about the OAI repository that it's serving. Adjust the URL once more by suffixing ?verb=Identify, so that your URL now ends in oaiserver?verb=Identify.
Visiting this page now gives some information about your Greenstone OAI repository.
- Although the data transmitted over OAI is in the form of XML, Greenstone uses a stylesheet to transform each XML response into the user-friendly, structured web page that you see when you visit the verb=Identify page. This allows Identify and the other verbs in the OAI specification to be shown on the main Greenstone OAI server pages as link buttons. You can see these verbs on the main Greenstone oaiserver (or oaiserver?verb=Identify) page as a row of links, starting with "Identify", at both the top and the bottom of the page. Clicking on a link executes that verb as a request and returns the response from your Greenstone OAI server as a structured web page. Try clicking on all the links.
- OAI defines a concept called a Set. In Greenstone, the OAI Set concept is mapped to the practical Greenstone collection. The link to the ListSets verb will therefore request the Greenstone OAI server to list all the collections that have been enabled for OAI. Click on the ListSets link and have a look. The response page for the ListSets verb will show you that your backdrop collection (created in the Simple image collection tutorial) is one of the collections available over OAI in your Greenstone repository.
- You will see a couple of buttons next to each collection (or Set) listed here. The first is Identifiers and the second Records. Click on the Identifiers button for the backdrop Set. This will list all the IDs of the documents contained in your OAI collection.
- Click the browser Back button to get back to the ListSets page and press the Records button located next to the backdrop collection. If you had specified some Dublin Core (dc) metadata for each of the images in the backdrop collection, then the page that loads will display this information for each document in the collection (Set).
Greenstone 3's OAI implementation uses oai_dc, the OAI standard metadata format for Dublin Core. By default, it maps all Dublin Core metadata you may have assigned to your collections into oai_dc. This default mapping is specified in the web\WEB-INF\classes\OAIConfig.xml file. If all (or most) of your collections use a different metadata format, you can edit the elementList section of OAIConfig.xml to create mappings from the metadata fields you're using to those in oai_dc. You can also specify mappings at the collection level, overriding the mappings in OAIConfig.xml for that collection. So if a collection uses a metadata set that the default mappings in OAIConfig.xml do not cover, adjust the collection's web\sites\localsite\collect\<collection-name>\etc\collectionConfig.xml file to tell Greenstone how to map the fields of your chosen metadata set into the oai_dc Dublin Core metadata set supported by the Greenstone OAI server.
For instance, look in the demo collection's collectionConfig.xml file (web\sites\localsite\collect\lucene-jdbm-demo\etc\collectionConfig.xml) and scroll down to the definition for the OAIPMH ServiceRack. Look at its ListMetadataFormats section containing element mappings, which explains, with an example, how to specify an OAI mapping from the DLS metadata format that the demo collection uses to the Dublin Core (oai_dc) metadata used by Greenstone's OAI server. Its dls.Organization metadata is mapped to oai_dc.publisher using the following line in the collectionConfig.xml configuration file (note the use of case):
<mapping elements="dls.Organization" />
Because the backdrop collection uses DC metadata, no mapping is required, as the default mappings from DC metadata to oai_dc are already specified in OAIConfig.xml.
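The link buttons used in the steps above issue ordinary HTTP GET requests, so the URLs behind them can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base URL http://localhost:8383/greenstone3/oaiserver is an example value (use whatever your browser shows), the set name backdrop is assumed to match the setSpec listed on your ListSets page, and oai_request_url is a hypothetical helper, not part of Greenstone. The verbs and the metadataPrefix and set arguments come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# Assumed example value; replace with the address of your own OAI server.
BASE_URL = "http://localhost:8383/greenstone3/oaiserver"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL (hypothetical helper)."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Identify and ListSets take no further arguments.
identify_url = oai_request_url(BASE_URL, "Identify")
list_sets_url = oai_request_url(BASE_URL, "ListSets")

# ListIdentifiers and ListRecords need a metadataPrefix; the optional
# 'set' argument restricts the request to one collection.
identifiers_url = oai_request_url(
    BASE_URL, "ListIdentifiers", metadataPrefix="oai_dc", set="backdrop"
)
records_url = oai_request_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", set="backdrop"
)

for url in (identify_url, list_sets_url, identifiers_url, records_url):
    print(url)
```

Pasting any of the printed URLs into the browser should show the same page as the corresponding link button. Requesting the base URL with no verb argument reproduces the badVerb error described earlier.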
Validating the Greenstone OAI server
In this section, you'll be testing that you've set up your Greenstone OAI server correctly so that it's accessible over OAI. For this part of the exercise, you need to be on a networked computer and your host computer needs to be visible to the outside world. (That is, when you provide the full name of your computer, someone else in the world should be able to find that computer by typing its URL into their browser's address field.)
We'll be using an external OAI client to access our up-and-running Greenstone OAI server. It's not just any OAI client either, but an OAI Server validator.
- We want the Greenstone library to be accessible to the Open Archives Validator. However, URLs that use localhost can only be accessed locally. Therefore, if your Greenstone server runs on localhost (as it does by default), you will need to edit the tomcat.server property in your Greenstone installation's top-level build.properties file and set it to your domain name or your machine's IP address.
- For this exercise, we will be visiting the Open Archives Validator, for which your OAI server needs to provide a valid email address. In a text editor, open up your Greenstone installation's resources/oai/OAIConfig.xml file. Set the value of the adminEmail element to the email address where the validation results are to be sent. Also set the OAI repositoryIdentifier element. Its value is structured like a domain name and needs to be of the form word-dot-extension, such as "greenstone.org". For more information on the structure of its value, see http://www.openarchives.org/OAI/2.0/guidelines-oai-identifier.htm. (If you wanted to additionally test the behaviour of the resumptionToken against the OAI Validator, you would set the resumeAfter element to a low value like 5.)
- If the Greenstone 3 server application is already running, quit and relaunch it. Otherwise, go to Start → Greenstone → Greenstone3 Server to start up the server. When the library home page opens in your browser, change the library suffix in the URL to oaiserver. The resulting URL is the baseURL of your OAI server and will be of the form http://domain/greenstone3/oaiserver. Copy this URL and visit http://www.openarchives.org/Register/ValidateSite.
- The Open Archives Validator page will request the URL of your Greenstone OAI server. Paste the URL you copied into the field provided for this, and press the Validate baseURL button to start running the tests. You will be told to check the adminEmail address you provided in order to continue with the remaining tests and to get the validation report. If the validator does not recognise the URL, make sure you have given the full domain of your host machine rather than just the host name. If that URL is still not accepted, visit the oaiserver?verb=Identify page again and check that it works. If it doesn't, it may be that your machine is not set up to be accessible to outside networks. Check your proxy settings, and make sure that you've set up port forwarding and that your firewall is not interfering.
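Some of the checks described in the steps above can be run locally before submitting the baseURL to the validator. The following Python sketch is not part of Greenstone or of the Open Archives Validator: base_url_problems and looks_like_repository_identifier are hypothetical helpers, the example URLs are made up, and the regular expression is only a simplified approximation of the word-dot-extension shape required of repositoryIdentifier.

```python
import re
from urllib.parse import urlparse

# Host names that only resolve on the local machine.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

# Simplified approximation of the "word-dot-extension" shape, e.g. greenstone.org.
REPOSITORY_ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]*(\.[A-Za-z][A-Za-z0-9-]*)+")


def looks_like_repository_identifier(value):
    """Return True if the value has a domain-name-like shape."""
    return REPOSITORY_ID_PATTERN.fullmatch(value) is not None


def base_url_problems(base_url):
    """Return a list of likely problems with an OAI baseURL (may be empty)."""
    problems = []
    host = urlparse(base_url).hostname
    if host is None:
        problems.append("no host name found in the URL")
    elif host.lower() in LOCAL_HOSTS:
        problems.append("host is local-only; set tomcat.server to a domain name or IP address")
    elif "." not in host and ":" not in host:
        problems.append("host has no domain part; use the full domain name")
    if not base_url.rstrip("/").endswith("/oaiserver"):
        problems.append("URL does not end in the oaiserver suffix")
    return problems


# Example values only; substitute your own settings.
print(looks_like_repository_identifier("greenstone.org"))
print(base_url_problems("http://localhost:8383/greenstone3/oaiserver"))
print(base_url_problems("http://dl.example.org/greenstone3/oaiserver"))
```

An empty list only means that none of these simple checks failed. Whether the server is actually reachable from outside still depends on your proxy, port forwarding and firewall settings.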