Bug #5094

RAD EAD Dates improperly encoded when not Dates of creation

Added by Dan Gillean about 9 years ago. Updated about 7 years ago.

Status: Verified
Start date: 05/16/2013
Priority: High
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: EAD
Estimated time: 2.00 hours
Target version: Release 2.2.0
Google Code Legacy ID:
Tested version: 1.3.1, 2.0.0, 2.0.1, 2.1
Sponsored: No
Requires documentation: No

Description

To Reproduce
1) Create a new archival description in the RAD template
2) Add several events (Names and Dates) - creation, broadcasting, manufacture, contribution, etc.
3) Save the record and export as EAD XML

Error Encountered
The use of the EAD is improper and is also dropping data.

<unitdate> is currently being used to pass all dates, though the EAD Tag Library specifies that this element is to be used to indicate the "creation year, month, or day of the described materials." The @DATECHAR attribute does allow for a "term characterizing the nature of the dates" but is generally used to qualify contributions from the creator, i.e. "creation" or "accumulation."

The @normal attribute, which should contain the dates in an ISO-acceptable format, contains "0/0" rather than the actual date for some of the events, such as "contribution" and "reproduction."

Finally, Places and Event notes were added for each of these, and none were present in the EAD export. The creator's event note will roundtrip, but places for all were dropped.

EXAMPLES
Please find attached:
1) a screenshot of the Names and Dates prior to export
2) a copy of the EAD XML that was exported
3) a screenshot of the results on re-import

On export, the dates were formatted in the EAD XML as follows:

<unitdate datechar="broadcasting" normal="1999/1999" encodinganalog="3.1.3">1999</unitdate>
<unitdate normal="1894/2398" encodinganalog="3.1.3">1894 - 2398</unitdate>
<unitdate datechar="contribution" normal="0/0" encodinganalog="3.1.3">1956 -  May 1974</unitdate>
<unitdate datechar="reproduction" normal="0/0" encodinganalog="3.1.3">2156</unitdate>
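
A quick way to spot the affected dates in an export is to look for <unitdate> elements whose @normal is the "0/0" placeholder. The following is only a minimal sketch using the four elements quoted above; the wrapping <did> element and the function name `find_placeholder_dates` are assumptions made for illustration, not part of AtoM.

```python
import xml.etree.ElementTree as ET

# The four <unitdate> elements quoted above, wrapped in a <did> parent
# (the wrapper is an assumption so the sample parses as one document).
SAMPLE = """<did>
<unitdate datechar="broadcasting" normal="1999/1999" encodinganalog="3.1.3">1999</unitdate>
<unitdate normal="1894/2398" encodinganalog="3.1.3">1894 - 2398</unitdate>
<unitdate datechar="contribution" normal="0/0" encodinganalog="3.1.3">1956 -  May 1974</unitdate>
<unitdate datechar="reproduction" normal="0/0" encodinganalog="3.1.3">2156</unitdate>
</did>"""


def find_placeholder_dates(xml_text):
    """Return (datechar, display text) for each <unitdate> whose @normal is "0/0"."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("datechar"), el.text)
        for el in root.iter("unitdate")
        if el.get("normal") == "0/0"
    ]


for datechar, text in find_placeholder_dates(SAMPLE):
    print(f"{datechar}: @normal is 0/0 but the display date is {text!r}")
```

Run against the sample, this reports the "contribution" and "reproduction" dates and leaves the other two alone.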

The creator's note will roundtrip because the creator also receives an event in the EAD, though no other events do:

<bioghist id="http://localhost:8080/~fiver/atom-1/index.php/reginald-lecreateur;isaar" encodinganalog="1.7B">
  <chronlist>
    <chronitem>
      <date type="creation" normal="18940000/23980000">1894 - 2398</date>
      <eventgrp>
        <event>
          <note type="eventNote"><p>This is an event note for Reginald LeCreateur</p></note>
          <origination encodinganalog="1.7C">
            <name>Reginald LeCreateur</name>
          </origination>
        </event>
      </eventgrp>
    </chronitem>
  </chronlist>
</bioghist>

Expected Result

I believe that we should be managing these other kinds of dates as events.

Here is an example of how we could encode them, based on the ridiculous case of Bobby Broadcaster in the examples attached to this issue.

<chronlist>
   <chronitem>
     <date type="broadcasting" normal="19990000/19990000">1999</date>
     <event>
       <note type="eventNote"><p>This is a note for Bobby Broadcaster</p></note>
       <name>Bobby Broadcaster</name>
       <geogname>XXXXXXX</geogname>
     </event>
   </chronitem>
   <chronitem>
     <!-- NEXT EVENTS WOULD GO HERE -->
   </chronitem>
</chronlist>

I will be consulting with others to ensure this is the best way to go forward before anyone begins working on this. But either way, the <unitdate> is not working for these elements from the RAD template.

names_dates_pre-export.png - Examples of name and date events pre-export in RAD template (27.9 KB) Dan Gillean, 05/16/2013 02:02 PM

names-and-dates-test-fonds;ead.xml - EAD XML output of the names and dates test fonds (3.63 KB) Dan Gillean, 05/16/2013 02:02 PM

names_dates_roundtrip.png - names and dates in AtoM RAD template after roundtrip (37.8 KB) Dan Gillean, 05/16/2013 02:03 PM

AtoM-fixingEADRoundtripping.pdf (418 KB) Dan Gillean, 04/11/2014 04:39 PM

dude.xml (9.38 KB) Misty De Meo, 09/26/2014 06:01 PM


Related issues

Related to Access to Memory (AtoM) - Bug #4509: Place added in Event dialog not importing RAD template In progress 01/07/2013
Related to Access to Memory (AtoM) - Bug #4267: EAC and EAD export have URI for Authority Records Feedback
Related to AtoM Wishlist - Feature #5584: Add place access point to description when Place included... New 09/13/2013
Blocks Access to Memory (AtoM) - Bug #4505: RELATEDENCODING In RAD Template still points to ISAD Elem... Verified 01/07/2013
Blocked by Access to Memory (AtoM) - Bug #7498: EAD import and Export broken in 2.2.x branch Verified 11/10/2014

History

#1 Updated by Dan Gillean about 9 years ago

  • Category set to EAD
  • Assignee set to José Raddaoui Marín
  • Target version set to Release 1.4.0

#2 Updated by David Juhasz almost 9 years ago

The EAD revision Relax NG schema doesn't provide any clues as to where to put a <chronlist> for events in the history of a set of archival materials. The only change to <chronlist> in EAD revised that I could find is that it has been removed from "m.para.content", which presumably means it can no longer be a direct child of a <p> element.

#3 Updated by Dan Gillean almost 9 years ago

I have posted the example of my proposed solution as part of a question to the EAD list-serv: http://listserv.loc.gov/cgi-bin/wa?A2=ind1307&L=ead&T=0&P=52

Hopefully, given the experts on the list (which includes members of the EAD Revision team), we will receive some responses that will offer alternate suggestions, and clarification as to whether or not my proposed structure makes sense. I will update the issue ticket with more information once I have it.

#4 Updated by Dan Gillean almost 9 years ago

  • Priority changed from Medium to High

#5 Updated by Jesús García Crespo almost 9 years ago

  • Estimated time set to 2.00

#6 Updated by Mike Gale about 8 years ago

It seems the incorrect dates issue has been resolved in 2.x. Tonight I'll look into the events not showing up.

#7 Updated by Dan Gillean about 8 years ago

Just to update this issue further:

We are still duplicating dates on roundtrip, and it is because of our strange use of the chronlist to embed information for creators; this also means that we are dropping data from users who create valid EAD where the bioghist and the origination elements are not embedded in a chronlist.

Following my post on the EAD list-serv, the one useful clarification I received (from the Chair of the EAD3 revision team, no less) was that <unitdate>, when using @datechar, is actually designed to contain other kinds of dates than creation - so we are already doing that correctly. The problem was in trying to roundtrip other fields available in the events modal for RAD.

Regardless, the chronlist is not doing it, and I no longer recommend this approach. I am attaching, instead, a proposal that, if it is possible to implement, will simplify our EAD and make it more compliant, solve the problem of dropping other people's compliant uses of bioghist and origination, solve the date duplication, and still roundtrip most of the data.

Waiting for developer feedback on what is possible and what is not. See attached PDF.

#8 Updated by David Juhasz over 7 years ago

Dan, for the most part I agree with your proposal in AtoM-fixingEADRoundtripping.pdf. I do have a few clarifying questions:

  1. Is the recommendation in EAD to not have a <unitdate @datechar> attribute for the creation date? As a programmer I would prefer to make this consistent with other <unitdate> types, e.g. <unitdate datechar="creation" normal="1900-1-1/1999-12-31" encodinganalog="3.1.3">January 1, 1900 - December 31, 1999</unitdate>
  2. Is it necessary to add <controlaccess><persname role="Creator"> in addition to the <origination> element? They will always be exactly the same as far as I know, so this seems redundant and a source of potential duplication.
  3. In the case of multiple <origination> and <bioghist> elements, I would assume that we would simply link them in order. E.g. the 1st <bioghist> becomes the history for the 1st <origination> authority record, the 2nd <bioghist> goes with the 2nd <origination>, etc. Is this how you would expect multiple <origination> + <bioghist> elements to work?
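
The in-order linking described in point 3 could be sketched roughly as follows. This is an illustration only, not AtoM's import code: the sample document, the creator names, and the function name `pair_creators_with_histories` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two <origination> elements and two <bioghist> elements.
SAMPLE = """<archdesc>
  <did>
    <origination><persname>First Creator</persname></origination>
    <origination><persname>Second Creator</persname></origination>
  </did>
  <bioghist><p>History of the first creator.</p></bioghist>
  <bioghist><p>History of the second creator.</p></bioghist>
</archdesc>"""


def pair_creators_with_histories(xml_text):
    """Link the Nth <origination> to the Nth <bioghist>, in document order."""
    root = ET.fromstring(xml_text)
    names = ["".join(o.itertext()).strip() for o in root.iter("origination")]
    histories = ["".join(b.itertext()).strip() for b in root.iter("bioghist")]
    # zip() stops at the shorter list, so an unmatched element is left unlinked.
    return list(zip(names, histories))


for name, history in pair_creators_with_histories(SAMPLE):
    print(f"{name}: {history}")
```

Note that this pairing relies purely on position, so a document with more <origination> than <bioghist> elements (or the reverse) would leave the extras unlinked.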

I would also recommend we drop the <bioghist @id> attribute. This was implemented in an attempt to link together instances of a single authority record within the EAD document, but I don't think it serves any purpose anymore.

Regarding the first point in the "ON IMPORT" section, "Match unitdates @datechar to the equivalent name's @role and the geogname's @role" I think we will have to create two separate events for the creation (or accumulation) of the archival materials:
  1. One event for the <origination> value - e.g the creator name
  2. One event for the <unitdate> (e.g. the creation date) and <geogname> (e.g. creation location) as these can be linked by the @role attribute
    I can't see any rigorous way to link an <origination> to a <unitdate> or <geogname> unless they all share a linking type (e.g. "creator/creation"). This compromise is similar to what we do with archival descriptions entered using the ISAD template - separate creator and creation date events are created.
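
To make the shared linking type concrete, here is a rough sketch of grouping <unitdate> and <geogname> elements by a common type value. It assumes that @datechar and @role carry the same term for a given event; the sample data and the function name `events_by_type` are hypothetical, and creator names are deliberately left out because <origination> has no such attribute to match on.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: dates carry @datechar, places carry @role.
SAMPLE = """<archdesc>
  <did>
    <unitdate datechar="broadcasting" normal="1999/1999">1999</unitdate>
    <unitdate datechar="creation" normal="1894/2398">1894 - 2398</unitdate>
  </did>
  <controlaccess>
    <geogname role="broadcasting">Lala land</geogname>
    <geogname>England.</geogname>
  </controlaccess>
</archdesc>"""


def events_by_type(xml_text):
    """Group dates and places under the type named by @datechar / @role."""
    root = ET.fromstring(xml_text)
    events = defaultdict(lambda: {"dates": [], "places": []})
    for el in root.iter("unitdate"):
        if el.get("datechar"):
            events[el.get("datechar")]["dates"].append(el.text)
    for el in root.iter("geogname"):
        # A <geogname> without @role is a plain place access point, not an event place.
        if el.get("role"):
            events[el.get("role")]["places"].append(el.text)
    return dict(events)


for event_type, parts in events_by_type(SAMPLE).items():
    print(event_type, parts)
```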

#9 Updated by Dan Gillean over 7 years ago

  1. Is the recommendation in EAD to not have a <unitdate @datechar> attribute for the creation date? As a programmer I would prefer to make this consistent with other <unitdate> types, e.g. <unitdate datechar="creation" normal="1900-1-1/1999-12-31" encodinganalog="3.1.3">January 1, 1900 - December 31, 1999</unitdate>

That sounds fine with me. The EAD 2002 Tag library only says that "The DATECHAR attribute can be used to supply a term characterizing the nature of the dates, such as creation or accumulation." Looking at many examples from other institutions, though, many do not add it to the creation date (such as in the 2 examples on the <unitdate> page).

So: as long as we are not REQUIRING it on import - and missing dates of creation if they do not have it - then I think it is fine, perhaps even better, to add it. I think on import, if there is no @DATECHAR, we should assume it's a creation date?
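
The import fallback suggested here could look something like the sketch below, assuming a missing @datechar means "creation". The helper name `event_type_for` and the sample are illustrative only.

```python
import xml.etree.ElementTree as ET


def event_type_for(unitdate):
    """Use @datechar when present; otherwise assume a creation date."""
    return unitdate.get("datechar") or "creation"


# Hypothetical sample: one date without @datechar, one with it.
dates = ET.fromstring("""<did>
<unitdate normal="1894/2398">1894 - 2398</unitdate>
<unitdate datechar="broadcasting" normal="1999/1999">1999</unitdate>
</did>""")

types = [event_type_for(el) for el in dates.iter("unitdate")]
print(types)
```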

  2. Is it necessary to add <controlaccess><persname role="Creator"> in addition to the <origination> element? They will always be exactly the same as far as I know, so this seems redundant and a source of potential duplication.

No it's not necessary - AtoM is currently adding it on export - my main interest was in seeing it not show up on import if it has the @role="Creator" because right now, the role is not being checked, so on roundtrip you end up with a duplicate - the original access point becomes a name (subject) access point, and a second one is automatically added by AtoM based on the creator name.
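
The role check on import might be sketched like this. It is an assumption-laden illustration, not the actual AtoM importer: the function name `name_access_points`, the list of name tags, and the case-insensitive comparison are all choices made for the example.

```python
import xml.etree.ElementTree as ET

# Name elements that can appear under <controlaccess>.
NAME_TAGS = ("persname", "corpname", "famname", "name")

SAMPLE = """<controlaccess>
  <persname role="Creator">Bushey, Jessica (1976 - )</persname>
  <name role="subject">Dan Gillean</name>
  <geogname role="broadcasting">Lala land</geogname>
</controlaccess>"""


def name_access_points(xml_text):
    """Collect name access points, skipping any marked role="Creator"."""
    root = ET.fromstring(xml_text)
    points = []
    for controlaccess in root.iter("controlaccess"):
        for el in controlaccess:
            # Case-insensitive match on the role is an assumption of this sketch.
            is_creator = (el.get("role") or "").lower() == "creator"
            if el.tag in NAME_TAGS and not is_creator:
                points.append(el.text)
    return points


print(name_access_points(SAMPLE))
```

With the creator entry skipped, the only name access point created from the sample is the subject name, so the creator would no longer be duplicated on roundtrip.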

  3. In the case of multiple <origination> and <bioghist> elements, I would assume that we would simply link them in order. E.g. the 1st <bioghist> becomes the history for the 1st <origination> authority record, the 2nd <bioghist> goes with the 2nd <origination>, etc. Is this how you would expect multiple <origination> + <bioghist> elements to work?

Yes, I think this is probably the best we can do.

I would also recommend we drop the <bioghist @id> attribute. This was implemented in an attempt to link together instances of a single authority record within the EAD document, but I don't think it serves any purpose anymore.

Sounds good to me!

On your other response to the "ON IMPORT" section - damn. If that's the best we can do, I guess that's that - though I think it is worth thinking on further if possible. At least for creation events (the most common, and the one on which the <origination> element is invoked), it would be nice to fix the way that 2 separate events exist! But yeah, if it's not possible then we make it as good as we can in the interface and DB, and make sure the EAD that comes out is consistent.

#10 Updated by Dan Gillean over 7 years ago

  • Target version deleted (Release 1.4.0)
  • Tested version 1.3.1, 2.0.0, 2.0.1, 2.1 added

#11 Updated by Dan Gillean over 7 years ago

  • Assignee changed from José Raddaoui Marín to Misty De Meo

Reassigned to Misty as it is relevant to a data migration.

#12 Updated by Misty De Meo over 7 years ago

Took a stab at removing <chronlist> on export.

I've attached an export of a test fonds from the test server (imported from its EAD export). Here's the bioghist and origination section:

<bioghist id="md5-2328b1c22e60cf4f8181a82beca22922" encodinganalog="3.2.2">
  <note><p>...</p></note>
  <date type="existence">0 - 3999</date>
</bioghist>

<origination encodinganalog="3.2.1">
  <persname>Dudely Dude the Third</persname>
</origination>

#13 Updated by Misty De Meo over 7 years ago

  • Target version set to Release 2.1.1

#14 Updated by Misty De Meo over 7 years ago

Updated geographic names added in the event module to specify `role="creation"`.

<controlaccess>
  ...
  <geogname>England.</geogname>
  <geogname>France.</geogname>
  <geogname>Halifax (N.S.) </geogname>
  <geogname role="creation">France.</geogname>
</controlaccess>

Dan - does this provide enough information for what you wanted on reimport? Or should there be more here?

#15 Updated by Dan Gillean over 7 years ago

My worry, upon closer examination, is that this will not allow us to re-associate multiple places with their respective events when multiple events are listed. Ideally, the @role attribute can reference the specific type of event, as in the example in the PDF:

<controlaccess>
    <persname role="Accumulator">Fiver Watson</persname>
    <persname role="Broadcaster">Bobby Broadcaster</persname>
    <persname role="Creator">Bushey, Jessica (1976 - )</persname>
    <name role="subject">Dan Gillean</name>
    <geogname role="broadcasting">Lala land</geogname>
    <geogname role="accumulation">Shangri La</geogname>
</controlaccess>

If this is not possible, we might have to go back to the drawing board :[

#16 Updated by Misty De Meo over 7 years ago

Updated to do that:

<geogname>England.</geogname>
<geogname>France.</geogname>
<geogname>Halifax (N.S.) </geogname>
<geogname role="reproduction">France.</geogname>

#17 Updated by Misty De Meo over 7 years ago

Updated again: @role now uses the getRole() function, and does not attempt to change capitalization or camelCase. This is consistent with the other @role attributes.

#18 Updated by Misty De Meo over 7 years ago

  • Status changed from New to QA/Review
  • Assignee changed from Misty De Meo to Dan Gillean

The first half of the EAD changes are complete; all of the EAD export functionality is done. The EAD import changes still need to be done.

#19 Updated by Misty De Meo over 7 years ago

A pull request for the code to preserve RAD event actors and places is up at https://github.com/artefactual/atom/pull/59.

#20 Updated by David Juhasz over 7 years ago

  • Assignee changed from Dan Gillean to Mike Gale

Mike, can you code review Misty's pull request please?

#21 Updated by Dan Gillean over 7 years ago

  • Assignee changed from Mike Gale to Dan Gillean
  • Target version changed from Release 2.1.1 to Release 2.2.0
  • Requires documentation set to No

Mike Gale completed code review, and the PR was closed and merged into the qa/2.2.x branch. Because this changes behavior, it was decided to put it in 2.2 rather than 2.1.1

#22 Updated by José Raddaoui Marín over 7 years ago

I've fixed a typo in qa/2.2.x, commit: 1dbf15cd09c276469d4777ffbd26aeeb99c86878 (Not sure if the test site needs deploy or it's still auto-updating the code ...)

#23 Updated by Dan Gillean about 7 years ago

  • Status changed from QA/Review to Verified
