Feature #8159

Improve OAI-PMH implementation of dublin core (oai_dc)

Added by Dan Gillean about 7 years ago. Updated about 7 years ago.

Status: Verified
Start date: 03/27/2015
Priority: Medium
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: OAI-PMH
Target version: Release 2.2.0
Google Code Legacy ID:
Tested version: 2.2
Sponsored: No
Requires documentation:

Description

Currently, AtoM's oai_dc output for an individual record in a ListRecords response looks something like this:

<record>
    <header>
      <identifier>oai:example-site.com:repocode_666</identifier>
      <datestamp>2010-06-14T05:25:50Z</datestamp>
      <setSpec>oai:oai:example-site.com:repocode_111</setSpec>
    </header>
    <metadata>
      <oai_dc:dc xmlns="http://purl.org/dc/elements/1.1/" 
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
        <title>Syllabus of lectures on &#039;Cities in Evolution&#039;</title>
        <description>An introductory course of general sociology. University of Bombay.</description>
        <date>1919</date>
        <format>1 item</format>
        <identifier>http://example-site.com/syllabus-of-lectures-on-cities-in-evolution</identifier>
        <identifier>5</identifier>
        <source></source>
        <language xsi:type="dcterms:ISO639-3">eng</language>
        <rights>Open</rights>
      </oai_dc:dc>
    </metadata>
  </record>

However, this is not how the OAI standard suggests that DC elements be written: instead, they should carry the dc: prefix. You can see this in the oai_dc schema (http://www.openarchives.org/OAI/2.0/oai_dc.xsd)...

<element name="dc" type="oai_dc:oai_dcType"/>

<complexType name="oai_dcType">
  <choice minOccurs="0" maxOccurs="unbounded">
    <element ref="dc:title"/>
    <element ref="dc:creator"/>
    <element ref="dc:subject"/>
    <element ref="dc:description"/>
    <element ref="dc:publisher"/>
    <element ref="dc:contributor"/>
    <element ref="dc:date"/>
    <element ref="dc:type"/>
    <element ref="dc:format"/>
    <element ref="dc:identifier"/>
    <element ref="dc:source"/>
    <element ref="dc:language"/>
    <element ref="dc:relation"/>
    <element ref="dc:coverage"/>
    <element ref="dc:rights"/>
  </choice>
</complexType>

... and you can see it in the examples provided in the documentation:

<?xml version="1.0" encoding="UTF-8"?>
<oai_dc:dc 
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" 
    xmlns:dc="http://purl.org/dc/elements/1.1/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
    http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:title xml:lang="en">Grassmann's space analysis</dc:title>
  <dc:creator>Hyde, E. W. (Edward Wyllys)</dc:creator>
  <dc:subject>LCSH:Ausdehnungslehre; LCCN QA205.H99</dc:subject>
  <dc:publisher>J. Wiley &amp; Sons</dc:publisher>
  <dc:date>Created: 1906; Available: 1991</dc:date>
  <dc:type>text</dc:type>
  <dc:identifier>http://resolver.library.cornell.edu/math/1796949
     </dc:identifier>
  <dc:language>english</dc:language>
  <dc:rights xml:lang="en">Public Domain</dc:rights>
</oai_dc:dc>

We can also see this in several other OAI implementations, such as:

This feature request would see us revise the OAI implementation in AtoM to better conform to standards-based practice, making the output of an OAI response more reusable.
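
To make the difference between the two forms concrete, here is a minimal Python sketch using only the standard library. The two snippets are trimmed-down versions of the records shown above, and the helper names `qualified_tags` and `has_literal_dc_prefix` are hypothetical. It suggests that a namespace-aware parser resolves both forms to the same qualified element names, so the change should matter most to consumers that expect the conventional dc: prefix in the raw XML.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Current AtoM style: DC is the default namespace, so child tags have no prefix.
UNPREFIXED = f"""<oai_dc:dc xmlns="{DC_NS}" xmlns:oai_dc="{OAI_DC_NS}">
  <title>Syllabus of lectures</title>
  <date>1919</date>
</oai_dc:dc>"""

# Style used in the OAI examples: DC is bound to the dc: prefix.
PREFIXED = f"""<oai_dc:dc xmlns:dc="{DC_NS}" xmlns:oai_dc="{OAI_DC_NS}">
  <dc:title>Syllabus of lectures</dc:title>
  <dc:date>1919</dc:date>
</oai_dc:dc>"""


def qualified_tags(xml_text):
    """Return the namespace-qualified tag of each child element."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


def has_literal_dc_prefix(xml_text):
    """Naive text check, like a consumer that matches on the raw prefix."""
    return "<dc:title>" in xml_text


# Both forms yield the same qualified names once namespaces are resolved.
print(qualified_tags(UNPREFIXED) == qualified_tags(PREFIXED))
# Only the prefixed form satisfies the naive text match.
print(has_literal_dc_prefix(UNPREFIXED), has_literal_dc_prefix(PREFIXED))
```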

History

#1 Updated by Dan Gillean about 7 years ago

See the suggestions from João Pereira on the user forum, here:

He notes:

I think it is easy to resolve the Dublin Core problem. I made some modifications to the file _dc.xml.php in \plugins\sfDcPlugin\modules\sfDcPlugin\templates of my localhost AtoM:

<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" 
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">

  <dc:title><?php echo esc_specialchars($resource->title) ?></dc:title>

  <?php foreach ($resource->getCreators() as $item): ?>
    <dc:creator><?php echo esc_specialchars($item) ?></dc:creator>
  <?php endforeach; ?>

  <?php foreach ($dc->subject as $item): ?>
    <dc:subject><?php echo esc_specialchars($item) ?></dc:subject>
  <?php endforeach; ?>

  <dc:description><?php echo esc_specialchars($resource->scopeAndContent) ?></dc:description>

  <?php foreach ($resource->getPublishers() as $item): ?>
    <dc:publisher><?php echo esc_specialchars($item) ?></dc:publisher>
  <?php endforeach; ?>

  <?php foreach ($resource->getContributors() as $item): ?>
    <dc:contributor><?php echo esc_specialchars($item) ?></dc:contributor>
  <?php endforeach; ?>

  <?php foreach ($dc->date as $item): ?>
    <dc:date><?php echo esc_specialchars($item) ?></dc:date>
  <?php endforeach; ?>

  <?php foreach ($dc->type as $item): ?>
    <dc:type><?php echo esc_specialchars($item) ?></dc:type>
  <?php endforeach; ?>

  <?php foreach ($dc->format as $item): ?>
    <dc:format><?php echo esc_specialchars($item) ?></dc:format>
  <?php endforeach; ?>

  <dc:identifier><?php echo url_for(array($resource, 'module' => 'informationobject'), true) ?></dc:identifier>

  <dc:identifier><?php echo esc_specialchars($resource->identifier) ?></dc:identifier>

  <dc:source><?php echo esc_specialchars($resource->locationOfOriginals) ?></dc:source>

  <?php foreach ($resource->language as $code): ?>
    <dc:language xsi:type="dcterms:ISO639-3"><?php echo strtolower($iso639convertor->getID3($code)) ?></dc:language>
  <?php endforeach; ?>

  <?php if (isset($resource->repository)): ?>
    <dc:relation><?php echo url_for(array($resource->repository, 'module' => 'repository'), true) ?></dc:relation>
    <dc:relation><?php echo esc_specialchars($resource->repository->authorizedFormOfName) ?></dc:relation>
  <?php endif; ?>

  <?php foreach ($dc->coverage as $item): ?>
    <dc:coverage><?php echo esc_specialchars($item) ?></dc:coverage>
  <?php endforeach; ?>

  <dc:rights><?php echo esc_specialchars($resource->accessConditions) ?></dc:rights>

</oai_dc:dc>

In essence, he has simply added the dc: prefix to the XML tags hardcoded in the template.
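
The effect of that change can be sketched outside the template as well. The following Python snippet uses only the standard library; the function name `reserialize_with_dc_prefix` and the sample record are illustrative assumptions and not part of AtoM. It parses unprefixed output and writes it back with the Dublin Core namespace bound to the dc: prefix.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Tell ElementTree which prefixes to use when serializing these namespaces.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def reserialize_with_dc_prefix(xml_text):
    """Parse a record and serialize it again using the registered prefixes."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample in the old style, with DC as the default namespace.
SAMPLE = (
    f'<oai_dc:dc xmlns="{DC_NS}" xmlns:oai_dc="{OAI_DC_NS}">'
    "<title>Syllabus of lectures</title>"
    "<rights>Open</rights>"
    "</oai_dc:dc>"
)

print(reserialize_with_dc_prefix(SAMPLE))
```

The element content and namespaces are unchanged; only the serialized prefixes differ, which mirrors what the template edit does by hand.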

The code he is suggesting we change is here:

Since the OAI plugin pulls its DC XML from the DC plugin template he recommends changing, this could have the added bonus of improving DC output throughout the application. The result also appears to conform better to the examples provided by DCMI.

João also notes:

But because I am not a developer, I ask to test these changes and evaluate its possible application.
Thank you all!

#2 Updated by Jesús García Crespo about 7 years ago

  • Status changed from New to QA/Review
  • Assignee changed from Mike Cantelon to Dan Gillean

#3 Updated by Dan Gillean about 7 years ago

  • Status changed from QA/Review to Verified
  • Target version set to Release 2.2.0

This is great! And it improves our DC XML throughout the application. I ran a sample DC XML export through the W3C XML validator and it came back green! OAI seems to be working great, and Mark Triggs, who submitted the pull requests, told us that he has tested this with 2 different harvesters in Australia.

Thanks to both João and Mark!
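
For a quick local check before running a full validator, something along these lines may help. This is a rough Python sketch using only the standard library; it is neither the W3C validator nor a schema validation. It only confirms that every child of oai_dc:dc is one of the fifteen elements listed in the oai_dc schema and sits in the Dublin Core namespace. The function name `unexpected_children` is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen element names referenced by oai_dcType in oai_dc.xsd.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}
# Namespace-qualified form, e.g. "{http://purl.org/dc/elements/1.1/}title".
ALLOWED_TAGS = {f"{{{DC_NS}}}{name}" for name in DC_ELEMENTS}


def unexpected_children(xml_text):
    """Return the tags of child elements that oai_dcType does not allow."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root if child.tag not in ALLOWED_TAGS]
```

An empty list means every child element is an expected DC element; anything else is worth a closer look in a proper validator.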

#4 Updated by Dan Gillean about 7 years ago

  • Requires documentation set to Yes

This will require an update to the dc-template page in the Data Entry section of the user manual (updating the examples for each field of the DC XML), and the example responses in the OAI documentation.

#5 Updated by Dan Gillean about 7 years ago

  • Requires documentation deleted (Yes)

DC XML examples updated in 2.2 docs for DC template and OAI-PMH docs, in: https://github.com/artefactual/atom-docs/commit/842b969d7702f3a8e2c44dd930aba1ce2c9a78d9
