Bug #8458

Origination and bioghist elements don't always line up properly

Added by Mike Gale about 7 years ago. Updated almost 7 years ago.

Status: Verified
Start date: 05/14/2015
Priority: High
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: EAD
Target version: Release 2.2.0
Google Code Legacy ID:
Tested version: 2.2
Sponsored: No
Requires documentation:

Related issues

Related to Access to Memory (AtoM) - Bug #8385: AtoM EAD is not compliant with DTD, causes DTD warnings o... Verified 05/04/2015

History

#1 Updated by Mike Gale about 7 years ago

  • Category set to EAD
  • Assignee set to Mike Gale
  • Priority changed from Medium to High
  • Target version set to Release 2.2.0
  • Tested version 2.2 added

Example where this is an issue: http://qa-22x.test.artefactual.com/advocates-society-collection

To reproduce:
- Export EAD for a fonds that has multiple creators, each with a history, and lower-level child records that also have multiple creators and histories
- Import the EAD file into a fresh AtoM install

Expected result:
- The creator names line up with their expected biographical histories

Actual result:
- The creator names and histories get mixed up. For example, in the fonds linked above, William gets Aaron's history after the EAD is imported into a fresh AtoM install

#2 Updated by Mike Gale about 7 years ago

  • Subject changed from origination and bioghist elements don't always line up properly to Origination and bioghist elements don't always line up properly

#3 Updated by Mike Gale about 7 years ago

  • Related to Bug #8385: AtoM EAD is not compliant with DTD, causes DTD warnings on roundtrip added

#4 Updated by Mike Gale almost 7 years ago

Further info:

This seems to occur when you have a mix of corpname and persname entries in the EAD file.

In the example previously mentioned in this ticket, the EAD file has the following tags:

<corpname>Advocates&#039; Society</corpname>
<persname>Aaron, Robert Bernard</persname>
<persname>Babcock, William J.</persname>

... and their corresponding bioghists below.

However, our code runs the XPath for persname before the XPath for corpname, so creators are added in a different order than they appear in the file, and they no longer line up with the order of the bioghists.
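To illustrate the mismatch described above, here is a minimal Python sketch. It is not AtoM's importer code: the element layout (one name per origination, bioghists as siblings of did) and the bioghist text are assumptions made for the example, and only the three creator names come from this ticket.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EAD fragment: the names come from the ticket,
# the structure and bioghist text are placeholders for illustration.
EAD = """
<archdesc>
  <did>
    <origination><corpname>Advocates' Society</corpname></origination>
    <origination><persname>Aaron, Robert Bernard</persname></origination>
    <origination><persname>Babcock, William J.</persname></origination>
  </did>
  <bioghist><p>History of Advocates' Society</p></bioghist>
  <bioghist><p>History of Aaron, Robert Bernard</p></bioghist>
  <bioghist><p>History of Babcock, William J.</p></bioghist>
</archdesc>
"""

root = ET.fromstring(EAD)

# Bioghists are read in document order.
bioghists = [b.findtext("p") for b in root.findall("bioghist")]

# Creators are collected one entity type at a time, persname before
# corpname, so the corporate body ends up last instead of first.
creators_by_type = [e.text for e in root.findall("did/origination/persname")]
creators_by_type += [e.text for e in root.findall("did/origination/corpname")]

# Pairing by position now attaches the wrong history to every creator.
mismatched = list(zip(creators_by_type, bioghists))
for name, history in mismatched:
    print(f"{name} <- {history}")
```

With this input, "Babcock, William J." is paired with the history of "Aaron, Robert Bernard", which is the symptom reported in the ticket.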

#5 Updated by Mike Gale almost 7 years ago

  • Status changed from New to Code Review
  • Assignee changed from Mike Gale to José Raddaoui Marín

Unfortunately, after spending a couple of hours trying to add ids to both origination and bioghist and match them that way, I realized this would bloat the code even further: we would have to support the chronlist stuff, the 'just hope they line up' approach we currently use, and the id-matching approach.

I decided the best compromise was to have the XPath parse the origination tags rather than the famname/corpname/persname values directly. That way the order in which creator names are added no longer depends on the entity type, and things should line up properly. You could still get them out of order if you had multiple persnames/corpnames mixed under a single origination tag. As far as I know that is technically legal, but I think it is an edge case.

https://github.com/artefactual/atom/pull/178
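The idea behind the fix can be sketched in Python as follows. This is an illustration only and does not reproduce the code in the pull request: the helper name `creators_in_document_order`, the sample fragment, and the bioghist text are all assumptions.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EAD fragment (names from the ticket,
# structure and bioghist text are placeholders).
EAD = """
<archdesc>
  <did>
    <origination><corpname>Advocates' Society</corpname></origination>
    <origination><persname>Aaron, Robert Bernard</persname></origination>
    <origination><persname>Babcock, William J.</persname></origination>
  </did>
  <bioghist><p>History of Advocates' Society</p></bioghist>
  <bioghist><p>History of Aaron, Robert Bernard</p></bioghist>
  <bioghist><p>History of Babcock, William J.</p></bioghist>
</archdesc>
"""

NAME_TAGS = ("persname", "corpname", "famname")


def creators_in_document_order(archdesc):
    """Hypothetical helper: walk the origination elements in document
    order and read whichever name element each one contains."""
    names = []
    for origination in archdesc.findall("did/origination"):
        for child in origination:
            if child.tag in NAME_TAGS:
                names.append(child.text)
    return names


root = ET.fromstring(EAD)
creators = creators_in_document_order(root)
bioghists = [b.findtext("p") for b in root.findall("bioghist")]

# Both lists follow document order, so positional pairing is correct.
matched = list(zip(creators, bioghists))
for name, history in matched:
    print(f"{name} <- {history}")
```

Because the query targets origination rather than a specific name element, the entity type no longer influences the order in which creators are collected.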

#6 Updated by José Raddaoui Marín almost 7 years ago

  • Status changed from Code Review to In progress
  • Assignee changed from José Raddaoui Marín to Mike Gale

It looks great, Mike! Just a small cosmetic change to some of the comments.

#7 Updated by Mike Gale almost 7 years ago

  • Status changed from In progress to QA/Review
  • Assignee changed from Mike Gale to Dan Gillean

Merged into qa/2.2.x.

#8 Updated by Dan Gillean almost 7 years ago

Going to call this good enough. Tested with both ISAD and RAD.

AtoM will spit out an empty bioghist element if one of multiple creators does not have an associated bioghist, so the ordering can be kept consistent. This will cause the DTD to throw a warning on re-import, but it works.
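The placeholder behaviour described above can be sketched in Python. This is not AtoM's export code: the function name `build_archdesc`, the creator labels, and the use of persname for every creator are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def build_archdesc(creators):
    """Hypothetical export helper: write one origination and one
    bioghist per creator, leaving the bioghist empty when the creator
    has no history, so positions stay aligned."""
    archdesc = ET.Element("archdesc")
    did = ET.SubElement(archdesc, "did")
    for name, _ in creators:
        origination = ET.SubElement(did, "origination")
        ET.SubElement(origination, "persname").text = name
    for _, history in creators:
        bioghist = ET.SubElement(archdesc, "bioghist")
        if history:
            ET.SubElement(bioghist, "p").text = history
    return archdesc


# Placeholder creators; the second one has no history.
creators = [
    ("Creator A", "History of Creator A"),
    ("Creator B", None),
    ("Creator C", "History of Creator C"),
]

archdesc = build_archdesc(creators)

# On re-import, positional pairing still works because the empty
# bioghist holds Creator B's slot.
names = [o.findtext("persname") for o in archdesc.findall("did/origination")]
histories = [b.findtext("p") for b in archdesc.findall("bioghist")]
roundtrip = list(zip(names, histories))
print(roundtrip)
```

Without the empty element, Creator C's history would shift up into Creator B's position on import.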

In RAD, when events are associated with specific Actors, non-creation events (such as publication, broadcast) will roundtrip perfectly. Creation events become dissociated from their creators on re-import - when you look at the edit template in the dates area, you can see a row with creation dates but no actor, and then a row with an actor (creator) but no dates. So, if you had 2 creators participating at different times (e.g. 2000 for creator A; 2001 for creator B), this information would be lost on roundtrip - you would see creation events in both 2000 and 2001, and that Creator A and Creator B are both creators, but the dates would not be associated with a specific actor any longer.

It may not be possible to improve this due to the differences in how events are handled in ISAD/DACS vs RAD - there is no way to associate an actor directly with an event in the ISAD and DACS templates, so this import behavior in RAD reflects how it is handled elsewhere.

Going to discuss with developers before flipping ticket.

#9 Updated by Dan Gillean almost 7 years ago

  • Status changed from QA/Review to Verified

Going to call this good enough for now. We can always open a new issue ticket to address the points raised in comment 8, above - but the problems associated with the original description of this issue have been resolved.
