Bug #6141

General note <odd> dropped during XML import

Added by Creighton Barrett over 8 years ago. Updated almost 6 years ago.

Status: Verified
Start date: 12/17/2013
Priority: Medium
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: EAD
Target version: Release 2.3.0
Google Code Legacy ID:
Tested version:
Sponsored: No
Requires documentation:

Description

We've noticed that data encoded with <odd><p> is being dropped during XML import.

An example:

<c02 id="ref144" level="file">
   <did>
      <unittitle>Equity : Refunds to members</unittitle>
      <unitid>MS-4-112, Box 9, Folder 11</unitid>
      <container id="cid679476" type="Box-folder" label="Text">Box 9, Folder 11</container>
      <unitdate>1957-1974</unitdate>
   </did>
   <odd id="ref145">
      <p>Includes correspondence and memos.</p>
   </odd>
</c02>

I note from the EAD tag library that <p> is allowed inside <odd>:

http://www.loc.gov/ead/tglib/elements/odd.html

This seems to happen both in 2.0 and on our earlier 1.x test site.


Related issues

Related to Access to Memory (AtoM) - Bug #6143: ISAD(G) and DACS general notes do not crosswalk to RAD ot... Verified 12/17/2013
Related to Access to Memory (AtoM) - Bug #6145: Unnecessary spacing in EAD elements, and between elements... Verified 12/17/2013
Related to Access to Memory (AtoM) - Bug #9868: Description ID EAD mapping is importing as a general note Feedback 05/19/2016
Related to Access to Memory (AtoM) - Bug #9869: Unrecognized note types in EAD import will cause 500 error Verified 05/19/2016

History

#1 Updated by Dan Gillean over 8 years ago

  • Category set to EAD
  • Assignee set to Jesús García Crespo
  • Target version set to Release 2.0.2

Hi Creighton,

Thanks for reporting this. The problem with importing <odd> seems to stem from how hard it is to map many of the RAD fields, which do not correspond 1:1 to ISAD or EAD elements. Consequently, we use the <odd> element in a variety of ways when exporting RAD descriptions. For example:

FROM THE DESCRIPTION CONTROL AREA:
<odd type="publicationStatus"><p>published</p></odd>
<odd type="descriptionIdentifier"><p>Description ID</p></odd>
<odd type="institutionIdentifier"><p>Institution ID</p></odd>

TITLE NOTES:
<odd type="titleVariation"><p>Title notes: Variations in title</p></odd>
<odd type="titleAttributions"><p>Title notes: Attributions and conjectures</p></odd>
<odd type="titleContinuation"><p>Title notes: Continuation of title</p></odd>
<odd type="titleStatRep"><p>Title notes: Statement of responsibility</p></odd>
<odd type="titleParallel"><p>Title notes: Parallel titles and other title information</p></odd>
<odd type="titleSource"><p>Title notes: Source of title proper</p></odd>

OTHER NOTES:
<odd type="edition" encodinganalog="1.8B7"><p>Other notes: Edition (RAD 1.8B7)</p></odd>
<odd type="physDesc"><p>Other notes: Physical description (RAD 1.8B9)</p></odd>
<odd type="conservation" encodinganalog="1.8B9b"><p>Other notes: Conservation (RAD 1.8B9b)</p></odd>
<odd type="material" encodinganalog="1.5E"><p>Other notes: Accompanying material (RAD 1.5E)</p></odd>
<odd type="alphanumericDesignation"><p>Other notes: Alpha-numerical designations (RAD 1.8B11)</p></odd>
<odd type="bibSeries"><p>Other notes: Publisher's series</p></odd>
<odd type="rights" encodinganalog="1.8B16b"><p>Other notes: Rights</p></odd>
<odd type="general" encodinganalog="1.8B21"><p>Other notes: General note (RAD 1.8B21)</p></odd>

Further, while testing this, I noticed that our ISAD and DACS general notes and the RAD "Other notes: General note" type do not currently crosswalk between templates, and in fact use different EAD elements. In the ISAD and DACS templates:

<note type="generalNote">
  <p>This is a general note in ISAD and DACS</p>
</note>

I have filed this as a bug for consideration in 2.0.2, as issue #6143. I think that it makes sense to stick with the <note> element for general notes across templates.

I think what we could do in the import script is say that any <note> OR <odd> element imported without a @type gets concatenated (with a line break between) to the general notes field. We can't predict the target field more accurately without a @type, but this should ensure that data is not dropped on import.
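The proposal above can be sketched roughly as follows. This is an illustration in Python, not AtoM's import code: the function name `collect_general_notes` and the second untyped note in the sample are made up for the example, and the sketch only looks at direct children of a single component.

```python
import xml.etree.ElementTree as ET


def collect_general_notes(component_xml: str) -> str:
    """Join the text of every direct-child <note> or <odd> lacking a type attribute.

    Hypothetical helper for illustration; it does not reflect AtoM's importer.
    """
    root = ET.fromstring(component_xml)
    parts = []
    for child in root:
        if child.tag in ("note", "odd") and "type" not in child.attrib:
            # Flatten nested <p> text and normalise whitespace.
            text = " ".join("".join(child.itertext()).split())
            if text:
                parts.append(text)
    # One line break between notes, as suggested above.
    return "\n".join(parts)


SAMPLE = """<c02 id="ref144" level="file">
  <did><unittitle>Equity : Refunds to members</unittitle></did>
  <odd id="ref145"><p>Includes correspondence and memos.</p></odd>
  <odd type="edition"><p>Typed, so it would be mapped elsewhere.</p></odd>
  <note><p>A second untyped note (invented for this example).</p></note>
</c02>"""

general_note = collect_general_notes(SAMPLE)
print(general_note)
```

The typed <odd> is skipped here on the assumption that a more specific mapping picks it up.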

Do you think this solution makes sense?

In the meantime, if you modify your EAD to one of the above examples, depending on the target field, it should work.

#2 Updated by Tim Hutchinson over 8 years ago

I see that if <odd> doesn't have a type attribute, it imports as OTHER_DESCRIPTIVE_DATA_ID; see the end of:
https://github.com/artefactual/atom/blob/2.x/apps/qubit/modules/object/config/import/ead.yml

I'm wondering if this is a note id that's no longer used? E.g. previously used for ISAD?

#3 Updated by Creighton Barrett over 8 years ago

Thanks for the quick responses, guys. Complicated!

In our case, the <odd> notes are a legacy from our MS Access days. The vast majority of our <odd> notes are actually scope and content notes (like the one in the example above), so I should've converted them to <scopecontent> before migrating our legacy EAD into the Archivists' Toolkit.

We're going to try an SQL query to correct the notes in our database (to fix the problem at the source) and then write some kind of script to batch edit the EAD we've already exported and transformed with our XSLT (to avoid having to re-export and transform everything). So I don't think it will be much of a problem for us moving forward, but I thought it would be good to report the bug because I imagine others will run into it if they are importing XML. And there is the crosswalking issue.
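A batch edit of that kind might look something like the sketch below. It is only an assumption about what such a script could do: the function name `odd_to_scopecontent` is hypothetical, it renames every untyped <odd> without checking whether the content really is scope and content, and `xml.etree.ElementTree` does not preserve a DOCTYPE declaration, so real EAD files would need more care.

```python
import xml.etree.ElementTree as ET


def odd_to_scopecontent(ead_xml: str) -> str:
    """Rename every <odd> that has no type attribute to <scopecontent>.

    Hypothetical helper; typed <odd> elements are left untouched.
    """
    root = ET.fromstring(ead_xml)
    for element in root.iter("odd"):
        if "type" not in element.attrib:
            element.tag = "scopecontent"
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    '<c02 id="ref144" level="file">'
    '<odd id="ref145"><p>Includes correspondence and memos.</p></odd>'
    '<odd type="general"><p>Left as is.</p></odd>'
    "</c02>"
)

converted = odd_to_scopecontent(SAMPLE)
print(converted)
```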

It's really interesting to see how the <odd> "type" attribute is used to get around RAD's clunkiness.

#4 Updated by Jesús García Crespo over 7 years ago

  • Target version changed from Release 2.0.2 to Release 2.1.0

#5 Updated by Jesús García Crespo over 7 years ago

  • Target version changed from Release 2.1.0 to Release 2.2.0

#6 Updated by Tim Hutchinson over 7 years ago

I tested a bit further. <odd> without a type attribute gets imported as type_id=126 (OTHER_DESCRIPTIVE_DATA_ID), as opposed to ISAD notes (125) or the RAD general note (241). Type 126 doesn't seem to display in any template, or to be exported.

So a workaround in the RAD template should be to update note.type_id from 126 to 241.
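As a rough illustration of that workaround, the sketch below runs the update against an in-memory SQLite table. The two-column `note` table is a stand-in invented for the example (only `note.type_id` and the values 126, 125 and 241 come from this thread); a real AtoM database is MySQL with a fuller schema, and the statement would retarget every note of type 126, so it should be tried on a backup first.

```python
import sqlite3

# Stand-in schema: only the type_id column mentioned above is modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, type_id INTEGER)")
conn.executemany("INSERT INTO note (type_id) VALUES (?)", [(126,), (125,), (126,)])

# Move notes from OTHER_DESCRIPTIVE_DATA_ID (126) to the RAD general note type (241).
cursor = conn.execute("UPDATE note SET type_id = ? WHERE type_id = ?", (241, 126))
updated = cursor.rowcount
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM note WHERE type_id = 126").fetchone()[0]
print(updated, remaining)
```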

#7 Updated by Sarah Romkey about 7 years ago

  • Target version deleted (Release 2.2.0)

#8 Updated by Jesús García Crespo over 6 years ago

  • Assignee deleted (Jesús García Crespo)

#9 Updated by Nick Wilkinson about 6 years ago

  • Assignee set to José Raddaoui Marín

#10 Updated by Nick Wilkinson about 6 years ago

  • Assignee changed from José Raddaoui Marín to Mike Cantelon

#11 Updated by Mike Cantelon about 6 years ago

  • Status changed from New to Code Review
  • Assignee changed from Mike Cantelon to Nick Wilkinson

Changed it so <odd> with no type imports as a general note (125, now used by both ISAD and RAD templates as per https://github.com/artefactual/atom/commit/7ff3bbcdd5000502bc98b755aafad3d20dcade88).

PR for code review: https://github.com/artefactual/atom/pull/346

#12 Updated by Nick Wilkinson about 6 years ago

  • Assignee changed from Nick Wilkinson to José Raddaoui Marín

Hi Radda, assigning to you for CR.

#13 Updated by José Raddaoui Marín about 6 years ago

  • Status changed from Code Review to Feedback
  • Assignee changed from José Raddaoui Marín to Mike Cantelon

Hi Mike, code looks good, but I'm not so sure about adding the '[not(@type)]' limitation in the selector. We may also want <odd> elements whose types are not handled by an earlier, more specific check to be imported as general notes.

#14 Updated by Dan Gillean about 6 years ago

I agree - if we don't have a mapping, we should import them as general notes.

#15 Updated by Mike Cantelon about 6 years ago

  • Status changed from Feedback to Code Review
  • Assignee changed from Mike Cantelon to José Raddaoui Marín

Makes sense. I've updated the PR, allowing the generic "odd" handler to run regardless of whether the element has a "type" attribute. I added some logic to make sure the node hasn't already been handled by a more specific "odd" handler before using the generic one.
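The idea of a generic handler that defers to more specific ones can be sketched like this. It is an illustration in Python rather than the code in the PR: the `SPECIFIC_TYPES` table, the target names and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of typed <odd> mappings; the importer's real table is not shown here.
SPECIFIC_TYPES = {"edition": "edition", "physDesc": "physical description"}


def text_of(element):
    """Return the element's text content with whitespace normalised."""
    return " ".join("".join(element.itertext()).split())


def import_odd_notes(component_xml):
    """Return (target, text) pairs for every <odd> in the component."""
    root = ET.fromstring(component_xml)
    handled = set()
    notes = []

    # Pass 1: specific handlers claim the types they recognise.
    for odd in root.iter("odd"):
        target = SPECIFIC_TYPES.get(odd.get("type"))
        if target is not None:
            notes.append((target, text_of(odd)))
            handled.add(odd)

    # Pass 2: the generic handler takes whatever is left, typed or not.
    for odd in root.iter("odd"):
        if odd not in handled:
            notes.append(("general", text_of(odd)))

    return notes


SAMPLE = """<c02>
  <odd id="ref145"><p>Includes correspondence and memos.</p></odd>
  <odd type="edition"><p>Second edition.</p></odd>
  <odd type="madeUpType"><p>Unmapped type.</p></odd>
</c02>"""

notes = import_odd_notes(SAMPLE)
print(notes)
```

Tracking handled nodes explicitly, instead of filtering with a selector such as '[not(@type)]', is what lets an unfamiliar type fall through to the general note.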

#16 Updated by José Raddaoui Marín about 6 years ago

  • Status changed from Code Review to Feedback
  • Assignee changed from José Raddaoui Marín to Mike Cantelon

Awesome! Thanks Mike, just what Sevein said in the PR.

#17 Updated by Mike Cantelon about 6 years ago

  • Status changed from Feedback to QA/Review
  • Assignee changed from Mike Cantelon to Dan Gillean

I've merged the fix so ready for QA.

#18 Updated by Dan Gillean about 6 years ago

  • Related to Bug #9868: Description ID EAD mapping is importing as a general note added

#19 Updated by Dan Gillean about 6 years ago

  • Related to Bug #9869: Unrecognized note types in EAD import will cause 500 error added

#20 Updated by Dan Gillean about 6 years ago

  • Status changed from QA/Review to Feedback
  • Assignee changed from Dan Gillean to Mike Cantelon

I'm a bit unclear on what the intended outcome was, as discussed by Radda, you, and me in the comments above, so I'm marking this for feedback for now. Essentially: I can confirm that an <odd> element without a type attribute will now import as a general note. However, please see issue #9869, which I just filed: if you try to import an unrecognized @type on an <odd> element, you get a 500 error. We can deal with that on the new ticket, but I thought that Radda's comment #13 and yours at #15 meant we were handling unfamiliar types?

#21 Updated by Mike Cantelon about 6 years ago

It worked for me with no type specified or a fake type, but the #9869 case does indeed cause an error, oddly. I'll isolate the bug.

#22 Updated by Mike Cantelon about 6 years ago

  • Assignee changed from Mike Cantelon to Dan Gillean

Yeah, this bug only happens if type="general". Will fix in other issue.

#23 Updated by Dan Gillean almost 6 years ago

  • Status changed from Feedback to Verified
  • Target version set to Release 2.3.0
