Bug #5867

<physdesc> data dropped during import

Added by Creighton Barrett over 8 years ago. Updated over 8 years ago.

Status: Verified
Start date: 10/25/2013
Priority: Medium
Due date:
Assignee: Mike Gale
% Done: 0%
Category: Import/Export
Target version: Release 2.0.1
Google Code Legacy ID:
Tested version:
Sponsored: No
Requires documentation:

Description

<physdesc> is dropped from imported EAD XML. Here's an example:

<archdesc level="fonds">
      <did>
         <unittitle>
            <persname role="creator">John Godfrey</persname> fonds</unittitle>
         <unitid>MS-2-575</unitid>
         <repository>
            <corpname>Dalhousie University Archives</corpname>
         </repository>
         <langmaterial>
            <language langcode="eng"/>
         </langmaterial>
         <physdesc>3.25 m of textual records and other material. - 10 videocassettes. - 19 photographs. - 3 audio reels</physdesc>

The file imports fine but the physical description data is dropped. This happens at all levels of description.


Related issues

Related to Access to Memory (AtoM) - Bug #5212: EAD <dimensions> element fails to import into AtoM (Verified, 06/10/2013)
Related to Access to Memory (AtoM) - Bug #5340: Extent field adding text upon EAD import (Verified, 07/12/2013)

History

#1 Updated by Creighton Barrett over 8 years ago

Actually, it looks like this only happens when <physdesc> does not have the nested sub-elements (e.g., <extent>, <dimensions>). Still poking around to see the extent of the issue.

#2 Updated by Dan Gillean over 8 years ago

  • Assignee changed from Mike Gale to José Raddaoui Marín
  • Target version set to Release 2.0.1

Agreed that this is a fix we should make. As the EAD 2002 Tag Library notes for <physdesc>, "The information may be presented as plain text, or it may be divided into the <dimensions>, <extent>, <genreform>, and <physfacet> subelements." Therefore we should not be dropping information that is not in the subelements.
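To illustrate the point above, here is a minimal sketch (in Python, not AtoM's actual PHP import code) of reading a <physdesc> so that plain text and subelement text are both kept. The function name physdesc_text is hypothetical, and the approach shown (collecting all character data and collapsing whitespace) is only one possible way to avoid dropping data.

```python
import xml.etree.ElementTree as ET


def physdesc_text(physdesc_xml: str) -> str:
    """Return all character data inside a <physdesc> element.

    Hypothetical helper: works whether the content is plain text,
    subelements such as <extent>, or a mix of the two.
    """
    elem = ET.fromstring(physdesc_xml)
    # itertext() yields text directly under <physdesc> as well as text
    # nested in child elements, in document order.
    raw = "".join(elem.itertext())
    # Collapse the line breaks and indentation found in pretty-printed EAD.
    return " ".join(raw.split())


# Plain-text form, the case reported as being dropped.
print(physdesc_text("<physdesc>4 volumes and 1 folder.</physdesc>"))

# Subelement form.
print(physdesc_text("<physdesc><extent>2.5 linear ft.</extent></physdesc>"))
```

Because the text is gathered from the whole element rather than from a fixed list of subelements, name elements such as <persname> nested inside <physdesc> would contribute their text as well.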

#3 Updated by Creighton Barrett over 8 years ago

I admit that this was a self-made problem for us :( When we noticed that 1.x was dropping multiple <extent> notes and generally having trouble with our <physdesc> data, our developer tweaked our XSLT to merge multiple notes into a single <physdesc> element with the appropriate RAD punctuation. Data that was translated with the earlier version of the stylesheet imports okay. But yes, <physdesc> data can be presented as plain text with no subelements, so it would be great for the import to accept both types of data.

#4 Updated by Dan Gillean over 8 years ago

  • Priority changed from Medium to High

This has become relevant to a client migration project. Bumping up priority. We should also make sure that any <name> (corpname, persname, famname) elements inside of <physdesc> still import properly.

See: #5212 and #5340 for previous related work on <physdesc>

Examples that should work when imported:

<physdesc>4 volumes and 1 folder.</physdesc>

<physdesc><extent>2.5 linear ft.</extent></physdesc>

<physdesc>
<extent>3 </extent>
<genreform>daguerreotypes, </genreform>
<physfacet>hand colored</physfacet>
</physdesc>

<physdesc>
<physfacet type="material">Paper</physfacet>
<physfacet type="ruling">Ruled in red ink</physfacet>
<physfacet type="watermarks">Briquet 1234</physfacet>
<physfacet type="binding">Bound in 19th century red leather</physfacet>
</physdesc>

<physdesc>
<extent>100 boxes; </extent>
<extent>50 linear feet</extent>
</physdesc>

Ideally, where possible, we can preserve the HTML definition list solution we came up with in #5212, or find a better method, so that the user's data displays correctly in AtoM and, on export, the user's internal tags in <physdesc> (such as <extent>, <dimensions>, etc.) are preserved. Where that is not possible, all data should still import correctly, even if it just appears as PCDATA in <physdesc>, though this is the less ideal solution.
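The two-tier behaviour described above could be sketched roughly as follows. This is an illustration in Python, not the code from #5212: the function name physdesc_to_html and the exact <dl>/<dt>/<dd> layout are assumptions, since the ticket does not show the markup that was actually used. Subelements become definition-list entries labelled with their tag name (so an exporter could in principle rebuild the tags), and anything else falls back to plain text so that nothing is dropped.

```python
import xml.etree.ElementTree as ET
from html import escape


def _squash(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def physdesc_to_html(physdesc_xml: str) -> str:
    """Render a <physdesc> as an HTML definition list or as plain text.

    Hypothetical sketch: the <dl> layout is assumed, not taken from AtoM.
    """
    elem = ET.fromstring(physdesc_xml)
    children = list(elem)
    # Text that sits directly in <physdesc>, outside any subelement.
    loose = [elem.text or ""] + [child.tail or "" for child in children]
    has_loose_text = any(part.strip() for part in loose)

    if not children or has_loose_text:
        # Fallback: keep every character as plain text (PCDATA).
        return escape(_squash("".join(elem.itertext())))

    items = []
    for child in children:
        value = _squash("".join(child.itertext()))
        # The tag name is kept as the label so it is not lost.
        items.append(f"<dt>{escape(child.tag)}</dt><dd>{escape(value)}</dd>")
    return "<dl>" + "".join(items) + "</dl>"


print(physdesc_to_html("<physdesc>4 volumes and 1 folder.</physdesc>"))
print(physdesc_to_html(
    "<physdesc><extent>100 boxes; </extent><extent>50 linear feet</extent></physdesc>"
))
```

Note that this sketch ignores attributes such as type on <physfacet>; a fuller version would need to carry those through as well for the export to round-trip.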

#5 Updated by Dan Gillean over 8 years ago

  • Priority changed from High to Critical

#6 Updated by Jessica Bushey over 8 years ago

  • Assignee changed from José Raddaoui Marín to Mike Gale

This is necessary to fix prior to importing the Wits EAD.

#7 Updated by Mike Gale over 8 years ago

  • Assignee changed from Mike Gale to Dan Gillean

#8 Updated by Mike Gale over 8 years ago

This should be fixed now. I also added <physdesc><genreform>...</genreform></physdesc> support, as Wits uses this sub-tag extensively in their EAD.

#9 Updated by Mike Gale over 8 years ago

  • Status changed from New to Feedback
  • Assignee changed from Dan Gillean to Mike Gale
  • Priority changed from Critical to Medium

Dan wants to find a better label for the genreform, so I'll hold off verifying this for now.

#10 Updated by Dan Gillean over 8 years ago

  • Status changed from Feedback to Verified
