Bug #4276

Dublin Core export creates non-compliant XML documents

Added by Anonymous about 10 years ago. Updated over 9 years ago.

Status: Duplicate
Start date:
Priority: High
Due date:
Assignee: José Raddaoui Marín
% Done: 100%
Category: Import/Export
Target version: Release 1.4.0
Google Code Legacy ID: atom-2328
Tested version:
Sponsored: No
Requires documentation:

Description

Google user: jessica%...@gtempaccount.com

To reproduce this error:
1) Go to CVA: http://searcharchives.vancouver.ca/railroad-trestle-tracks-and-b-m-s-co-ltd-freight-cars-at-1200-foot-level-at-britannia-mines;rad

2) Select Dublin Core Export

Resulting error:
Chrome: This page contains the following errors: "error on line 8 at column 46: xmlParseEntityRef: no name"
Firefox: XML parsing error: not well formed

Expected result:
View XML for record

Some of the “Dublin Core” files are non-compliant XML documents.
Consider the file for Item “AM54-S4-3-: PAN N38”. The culprit is the “&”
character in the title. Outside of a CDATA section, the parser treats it
as the start of an entity reference and not as a shorthand for “and”. For
compliance, it needs to be replaced by the predefined entity reference
“&amp;”.
Some XML parsers, like the one used by the Opera browser, are not
forgiving and reject the whole document.
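
The escaping described above can be illustrated with a minimal Python sketch. This is not the fix applied in AtoM (which is a PHP application); it uses only the Python standard library, and the helper names dc_title_element and is_well_formed are hypothetical, chosen for this illustration. The title is the one from the record in the steps to reproduce.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Title of the CVA record from the steps to reproduce.
TITLE = (
    "[Railroad trestle, tracks and B.M. & S. Co. Ltd. freight cars "
    "at 1200 foot level at Britannia Mines]"
)


def dc_title_element(title: str) -> str:
    """Hypothetical helper: build a <title> element, escaping &, < and >."""
    return "<title>" + escape(title) + "</title>"


def is_well_formed(xml_text: str) -> bool:
    """Hypothetical helper: True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Unescaped output, as produced by the export described in this issue.
raw = "<title>" + TITLE + "</title>"

# Escaped output: the bare "&" becomes "&amp;".
escaped = dc_title_element(TITLE)

print(is_well_formed(raw))      # the bare "&" makes the parser reject the document
print(is_well_formed(escaped))  # the escaped form parses
print(escaped)
```

After parsing, the text of the escaped element is the original title again, so escaping changes only the serialized form and not the metadata value.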

[g] Legacy categories: Information object, Import/Export


Related issues

Related to Access to Memory (AtoM) - Bug #3809: Parsing error in MODS and DC when title contains an amper... Verified
Related to Access to Memory (AtoM) - Bug #4302: DC XML import/export roundtripping Verified

History

#1 Updated by Anonymous about 10 years ago

Tested:
Export CVA record as EAD XML.
Import into Trunk.
Export record as DC XML.
XML parsing error: <title>[Railroad trestle, tracks and B.M. & S. Co. Ltd. freight cars at 1200 foot level at Britannia Mines]</title>

#2 Updated by Jessica Bushey almost 10 years ago

[g] New owner: Jessica Bushey

#3 Updated by Dan Gillean almost 10 years ago

Suggest merging this with Issue 3809 (https://projects.artefactual.com/issues/3809) - both issues identify the ampersand as the culprit.

#4 Updated by Jesús García Crespo almost 10 years ago

[g] New owner: Jesús García Crespo

#5 Updated by David Juhasz over 9 years ago

  • Target version changed from Release 2.1.0 to Release 1.4.0
  • Sponsored set to No

#6 Updated by Dan Gillean over 9 years ago

  • Assignee changed from Jesús García Crespo to José Raddaoui Marín

#7 Updated by Dan Gillean over 9 years ago

  • Description updated (diff)

#8 Updated by José Raddaoui Marín over 9 years ago

  • Status changed from New to QA/Review
  • % Done changed from 0 to 100

Applied in changeset atom|commit:8aaf9e4d6c5b125743463c45ff4dd66543cf0366.

#9 Updated by Jesús García Crespo over 9 years ago

  • Category set to Import/Export

#10 Updated by Dan Gillean over 9 years ago

  • Status changed from QA/Review to Duplicate

Issue marked as duplicate; see issue 3809 for work on fixing the ampersand.

As for testing, I cannot test this based on the instructions included here because issue #4302 is still outstanding: roundtripping in Dublin Core results in an empty information object. Jessica is testing 3809 and will comment there.
