Bug #3850

Container and Physical location fields not exporting

Added by Evelyn McLellan about 9 years ago. Updated over 6 years ago.

Status: Verified
Start date:
Priority: High
Due date:
Assignee: José Raddaoui Marín
% Done: 100%
Category: EAD
Estimated time: 3.00 hours
Target version: Release 1.4.0
Google Code Legacy ID: atom-1901
Tested version:
Sponsored: Yes
Requires documentation:

Description

To reproduce this error:
1) Go to the show screen for an information object with attached physical storage
2)Export the record

Resulting error:
No physloc fields in the export file

Expected result:
Should be something like <physloc>Shelf 11, Aisle C10, Main Repository</physloc> in export file

[g] Legacy categories: Import/Export, EAD

test_physical_location.png (71.2 KB) José Raddaoui Marín, 04/15/2013 07:03 AM

test-physical-location.xml (2.99 KB) José Raddaoui Marín, 04/15/2013 07:03 AM

History

#1 Updated by Evelyn McLellan about 9 years ago

  • Subject set to Container and Physical location fields not exporting

The <container> field is not exporting either. There should be a field like <container type="Box">Box D9</container>. Am updating the issue title to reflect this.

#2 Updated by David Juhasz over 8 years ago

[g] Labels added: Component-EAD
[g] New owner: MJ Suhonos

#3 Updated by David Juhasz over 8 years ago

  • Priority set to Medium

[g] Labels added: Priority-Medium

#4 Updated by David Juhasz almost 8 years ago

  • Target version set to Release 1.3

Roll over to Release 1.3

[g] Labels added: Milestone-Release-1.3

#5 Updated by Jesús García Crespo over 7 years ago

[g] New owner: David Juhasz

#6 Updated by David Juhasz over 7 years ago

Reassign to new account.

[g] New owner: David Juhasz

#7 Updated by Jessica Bushey about 7 years ago

Physical container is being exported, but only when it is included at a lower level of a multi-level description. The physical container for the fonds is not recorded in the EAD export, but the physical container for the series (child level) is. On import, the physical container for the series is accurate and represented, but not the one for the fonds level.

Below is the sample EAD.XML from the export for the series level physical container.

<did>
<physloc>cold storage, shelf 8A</physloc>
<container type="Folder">SPCA 01-A</container>
<unittitle encodinganalog="3.1.2">Client files</unittitle><unitid encodinganalog="3.1.1">01</unitid>
<unitdate datechar="creation" normal="1922/2010" encodinganalog="3.1.3">1922 - 2010</unitdate>
<repository><corpname>Delta Archives</corpname></repository>
</did>

#8 Updated by Jessica Bushey about 7 years ago

  • Target version changed from Release 1.3 to Release 2.1.0

[g] Labels added: Milestone-Release-2.0
[g] Labels removed: Milestone-Release-1.3

#9 Updated by Mike Cantelon about 7 years ago

Should be fixed.

#10 Updated by Mike Cantelon about 7 years ago

  • Status changed from New to QA/Review

[g] New owner: Jessica Bushey

#11 Updated by Jessica Bushey about 7 years ago

  • Status changed from QA/Review to Feedback

Sorry, but it is still missing from both ISAD and RAD export to EAD. I attached physical storage to a fonds and a series, but nothing is showing up in the EAD.

[g] New owner: Mike Cantelon

#12 Updated by David Juhasz almost 7 years ago

  • Category set to EAD
  • Priority changed from Medium to High
  • Target version changed from Release 2.1.0 to Release 1.4.0
  • Sponsored set to Yes

#13 Updated by David Juhasz almost 7 years ago

  • Description updated (diff)

#14 Updated by Dan Gillean over 6 years ago

  • Assignee changed from Mike Cantelon to José Raddaoui Marín

#15 Updated by José Raddaoui Marín over 6 years ago

Now AtoM is exporting only <container>; <physloc> doesn't appear in the export.

I have tried with the ISAD and RAD templates, with Fonds, Series and Files in a three-level hierarchy, with different physical locations and many other combinations, and for me <container> appears when it should.

Could you give me more information?

#16 Updated by Jessica Bushey over 6 years ago

Radda,

The physical container information is roundtripping, BUT it is creating a Warning upon Import: syntax for attribute type of container is not valid.

This is suggested:

<container type="box">XX</container>

We are using this:

<container type="box">archival storage<title>XX</title></container>

#17 Updated by David Juhasz over 6 years ago

  • Estimated time set to 3.00

#18 Updated by José Raddaoui Marín over 6 years ago

This warning appears when the type attribute contains spaces. To begin with, it fails with "Cardboard box", "Hollinger box", "Filing cabinet" and "Map cabinet". But when importing, if the type does not match any existing term, a new term and type are created; so if a new type with spaces is created, it will produce a warning too.

And if we use <container type="box">XX</container> instead of <container type="box">archival storage<title>XX</title></container>, we will lose the location in the import.

How should we export types with spaces?
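
For illustration only, here is a minimal Python sketch of the kind of check that would flag these values before export. It assumes the warning comes from the type attribute being declared as an NMTOKEN in the EAD DTD (not stated in this issue), and it uses a simplified ASCII-only pattern; the real XML definition allows many more characters. The function name is hypothetical and is not part of AtoM.

```python
import re

# Simplified, ASCII-only approximation of an XML NMTOKEN: one or more
# name characters (letters, digits, '.', '-', '_', ':'). Whitespace is
# never allowed, which is what the values below trip over.
NMTOKEN_RE = re.compile(r"[A-Za-z0-9._:-]+")


def is_simple_nmtoken(value: str) -> bool:
    """Return True if value matches the simplified NMTOKEN pattern."""
    return NMTOKEN_RE.fullmatch(value) is not None


for container_type in ["Box", "Folder", "Hollinger box", "Map cabinet"]:
    print(container_type, is_simple_nmtoken(container_type))
```

Under this approximation, "Box" and "Folder" pass, while any term containing a space fails.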

#19 Updated by Jessica Bushey over 6 years ago

<container type="Hollinger box">
Aisle 4, Shelf 5 <title>JFF-01</title>
</container>

We were getting: "Syntax value for attribute type of container is not valid".

So I changed it to lowerCamelCase:

<container type="hollingerBox">
Aisle 4, Shelf 5 <title>JFF-01</title>
</container>

No more warning!

#20 Updated by Dan Gillean over 6 years ago

My suggestion is to keep the fix for this issue as close to available EAD examples as possible - in most cases I have seen, only the @TYPE of "box" or "folder" have been used. Therefore, I think that we should keep our types simple - box, folder, shelf, etc - and if possible, use the @LABEL attribute for specifying which specific kind of box.
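
A rough Python sketch of this type/label split, for discussion only. It assumes the last word of the AtoM container type term is the generic type and that any preceding words describe the specific kind; that heuristic and the function name are hypothetical, and a fixed lookup table might be safer in practice.

```python
from typing import Optional, Tuple


def split_container_type(term: str) -> Tuple[str, Optional[str]]:
    """Split a container type term into (type, label).

    Assumption: the last word is the generic type ("box", "cabinet"),
    and the words before it, if any, become the label.
    """
    words = term.strip().lower().split()
    if not words:
        raise ValueError("container type term is empty")
    generic_type = words[-1]
    label = " ".join(words[:-1]) or None
    return generic_type, label


for term in ["Hollinger box", "Cardboard box", "Folder"]:
    print(term, "->", split_container_type(term))
```

A single-word term such as "Folder" gets no label, so no label attribute would need to be written for it.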

Example: user selects Hollinger box from the container type drop down list in AtoM: EAD will be

<container type="box" label="hollinger"><title>[USER_DATA_FROM_CONTAINER_NAME_FIELD]</title></container>

Location information should not be captured inside of the <container> field. This data should be using the <physloc> EAD element. See: http://www.loc.gov/ead/tglib/elements/physloc.html

Here is an example from the physloc element:

    <c02 level="file">
        <did>
            <physloc>112.I.8.1B-2</physloc>
            <container type="box">2</container>
            <unittitle><unitdate type="inclusive">December 1908-July 1917
            </unitdate></unittitle>
        </did>
    </c02>

#21 Updated by José Raddaoui Marín over 6 years ago

Hi Dan,

I've added the label attribute for those elements with spaces, but we have a problem with the location information: if we use <physloc>, the location and the container will be separated, which will make it difficult to join them in the import, because an object can have more than one physical storage.

#22 Updated by Dan Gillean over 6 years ago

Hi Radda,

Each <physloc> and <container> will be contained within the <did> of a <c> level description. It's true that a <physloc> may hold many containers, but that information will be repeated in the export/import within every <did> to which it applies. If there are multiple containers within the same <did> and only one <physloc>, it is reasonable to assume that each container belongs to the same physical location. I am not sure I see the problem in uniting them, but perhaps I don't understand the scripts that you are creating to manage the import. If we assume that any <container> listed in a <did> belongs to the <physloc> within the same <did>, is the import still challenged? And if so, can you clarify so I can better answer your question? Thanks.

#23 Updated by José Raddaoui Marín over 6 years ago

An information object can have more than one physical storage, each physical storage has a location and a container. If we use <physloc> and <container> there will be more than one for each one inside the <did> element. I'm attaching an example with the work made so far.

#24 Updated by Dan Gillean over 6 years ago

Hi Radda,
What if we were to use the physical location provided in the interface (ie, the physloc element) as an id for the container?

<physloc>Almacén 32</physloc>
<container type="folder" id="Almacén 32">
    <title>carpeta_A</title>
</container>
<physloc>Almacén 32</physloc>
<container type="box" id="Almacén 32">
     <title>house in a tree</title>
</container>
<container type="box" label="cardboard">
   <title>box 35</title>
</container>

This would depend on 2 things:
1) That we can test and ensure that any information the user might enter into the Location field will not cause XML errors when round tripped - spaces, accents and other diacritical marks, special characters, etc.
2) If the Location field is not filled out (i.e., IF Location == "" or NULL, as in the "box 35" example above) then we do not add the ID field to the container.

I don't like this solution, but then again, the problem is with the existing interface, which does not allow for IDs to be used, for the Physical location to contain its own drop down (and thus have locations be re-used), and it confuses things by including "shelf" as a possible container in the controlled dropdown vocabulary (which rightly belongs to <physloc>). I hope in the future that we can find institutions interested in funding development for this part of AtoM, to address these problems throughout the application. In the meantime, let me know your thoughts on this proposed solution.

#25 Updated by José Raddaoui Marín over 6 years ago

Location is a free-text field, and it will create the same warning for the id attribute if, for example, it has spaces. It doesn't look like a good option to me. I can't see any wrapper for both elements, but maybe we can use the <ref> element inside <physloc> or <container> to reference the other. But I don't know, maybe I'm talking nonsense.

#26 Updated by Dan Gillean over 6 years ago

Radda, I have posted a question about this to the EAD List-Serv, to see if the broader community has suggestions about how we can approach this while still conforming to accepted best practice. I will update the issue again soon if/when I have some feedback. View the question on the list-serv here: http://listserv.loc.gov/cgi-bin/wa?A2=ind1304&L=ead&T=0&P=1551

If we don't get any good suggestions, Jessica and I will discuss and get back to you.

#27 Updated by Dan Gillean over 6 years ago

Hi Radda,

After discussing with people on the list-serv (see: http://listserv.loc.gov/cgi-bin/wa?A1=ind1304&L=ead&T=0, all headings listed under the title "Establishing a relationship between containers and physical location"), it seems the best option to add a relationship between the two is the following:

We will add an @id to the <physloc>, and then put the same value in <container>/@parent. We were considering using the table ID from QubitRelation, but since the value only needs to be unique within the EAD document, there is no reason to reveal table IDs to end users via our EAD (security).

Instead, we will use a simple counter. My preference would be to use a number with several leading zeros - ie, 0001 not 1. So here is an example:


<physloc id="0001">Almacén 32</physloc>
<container type="folder" parent="0001">folder title here</container>
<physloc id="0002">Almacén 32</physloc>
<container type="box" parent="0002">house in a tree</container>
<container type="box" label="cardboard">box 35</container> <!--if there's no associated physloc, we should not add @parent -->

You will note that the IDs are different for the two identical physical locations. This is due to the limitations of our current GUI. There is no dropdown list for physical location and no way to reuse the same information, so a user would have to manually type out the same information (e.g., "Almacén 32") twice. Every @id must be unique in XML, so we would need a way to use the <physloc> element only once and point two <container> elements at it, but that's not really possible with the current interface.

So for now, just use a different ID for each location; if the user repeats the location, that's okay. It's already generating 2 different <physloc> elements, so adding a unique id to each should not be a problem.

Ideally, if there is no physical location information added, then the <container> should not have an empty @parent.
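
To show how the import side could re-join the two elements under this scheme, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not the AtoM import code; the function name and the returned dictionary keys are hypothetical. The sample ids are copied from the example above, and ElementTree does not validate attribute syntax against a DTD.

```python
import xml.etree.ElementTree as ET


def pair_containers_with_locations(did_xml: str) -> list:
    """Pair each <container> in a <did> with the <physloc> named by @parent."""
    did = ET.fromstring(did_xml)
    # Index the locations by their @id so containers can look them up.
    locations = {
        loc.get("id"): (loc.text or "").strip()
        for loc in did.findall("physloc")
        if loc.get("id")
    }
    pairs = []
    for container in did.findall("container"):
        pairs.append({
            "type": container.get("type"),
            "label": container.get("label"),
            "name": (container.text or "").strip(),
            # A container without @parent simply has no location.
            "location": locations.get(container.get("parent")),
        })
    return pairs


sample = """<did>
<physloc id="0001">Almacén 32</physloc>
<container type="folder" parent="0001">folder title here</container>
<physloc id="0002">Almacén 32</physloc>
<container type="box" parent="0002">house in a tree</container>
<container type="box" label="cardboard">box 35</container>
</did>"""

for pair in pair_containers_with_locations(sample):
    print(pair)
```

Because each container carries its own @parent, several physical storages in the same <did> can be told apart even when their location text is identical.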

#28 Updated by José Raddaoui Marín over 6 years ago

Hi Dan, thanks a lot!

I made the changes and everything looks fine, just a little problem when importing. A warning appears for every id or parent attribute:

libxml error 502 on line 24 in input file: Syntax of value for attribute id of physloc is not valid
libxml error 502 on line 25 in input file: Syntax of value for attribute parent of container is not valid

The problem is that the ID has to be a valid NCName, which means, for example, that the first character can't be a digit. I tried with 'id0001' instead of '0001' and the warning disappears.

Is 'id0001' OK? Any other suggestions?
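
As a quick illustration of the NCName rule, here is a small Python check. It is a simplified, ASCII-only approximation (the real definition also accepts many non-ASCII letters), and the function name is hypothetical.

```python
import re

# Simplified, ASCII-only approximation of an NCName: it must start with a
# letter or underscore, followed by letters, digits, '.', '-' or '_'.
# Colons and whitespace are not allowed.
NCNAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_simple_ncname(value: str) -> bool:
    """Return True if value matches the simplified NCName pattern."""
    return NCNAME_RE.fullmatch(value) is not None


for candidate in ["0001", "id0001", "Almacén 32"]:
    print(candidate, is_simple_ncname(candidate))
```

With this pattern '0001' is rejected because it starts with a digit, 'id0001' is accepted, and a free-text location such as "Almacén 32" is rejected (the space is invalid under the real rule too; the accented letter is rejected only because of the ASCII simplification).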

#29 Updated by Dan Gillean over 6 years ago

I worry that our users will be confused by this random ID, if they ever look at the EAD closely, so maybe this is an opportunity to make the relationship explicit.

Why don't we use "physloc" as the prefix? For example, <physloc id='physloc0001'> and <container parent='physloc0001'>

It's a bit longer, but it makes the relationship clearer. Otherwise, id is fine.
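
Putting the counter and the "physloc" prefix together, the export could be sketched in Python roughly as follows. This is not the AtoM export template (AtoM itself is not written in Python); the function name and the input dictionary keys ('location', 'type', 'label', 'name') are hypothetical.

```python
from itertools import count
from xml.sax.saxutils import escape, quoteattr


def render_physical_storage(storages, prefix="physloc"):
    """Return <physloc>/<container> lines for a list of storage dicts."""
    counter = count(1)
    lines = []
    for storage in storages:
        attrs = ["type=" + quoteattr(storage["type"])]
        if storage.get("label"):
            attrs.append("label=" + quoteattr(storage["label"]))
        if storage.get("location"):
            # Zero-padded counter with a prefix, e.g. physloc0001.
            loc_id = "{}{:04d}".format(prefix, next(counter))
            lines.append("<physloc id={}>{}</physloc>".format(
                quoteattr(loc_id), escape(storage["location"])))
            attrs.append("parent=" + quoteattr(loc_id))
        # No location means no <physloc> and no @parent on the container.
        lines.append("<container {}>{}</container>".format(
            " ".join(attrs), escape(storage["name"])))
    return lines


storages = [
    {"location": "Almacén 32", "type": "folder", "name": "carpeta_A"},
    {"location": "Almacén 32", "type": "box", "name": "house in a tree"},
    {"location": "", "type": "box", "label": "cardboard", "name": "box 35"},
]
print("\n".join(render_physical_storage(storages)))
```

Each location gets its own id even when the text repeats, and the counter only advances when a location is present.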

#30 Updated by Dan Gillean over 6 years ago

Also, please note (I put it in the example in comment #27, but forgot to point it out):

Although <title> is technically allowed in <container>, its use in this context is improper, as the Chair of the Technical Subcommittee on EAD reminded us via list-serv responses. <title> is "The formal name of a work, such as a monograph, serial, or painting, listed in a finding aid." Since the <container> element can contain CDATA, we don't need it here anyways; please remove the <title> element from the EAD in this fix.

Thanks!

#31 Updated by José Raddaoui Marín over 6 years ago

  • Status changed from Feedback to QA/Review
  • % Done changed from 0 to 100

Applied in changeset atom|commit:8a538b6c336d2ee03f6906de2db5ee20fe4761ff.

#32 Updated by José Raddaoui Marín over 6 years ago

  • Status changed from QA/Review to In progress
  • % Done changed from 100 to 90

#33 Updated by José Raddaoui Marín over 6 years ago

  • Status changed from In progress to QA/Review
  • % Done changed from 90 to 100

#34 Updated by Jessica Bushey over 6 years ago

  • Status changed from QA/Review to Verified

Fixed.

Here are two examples of the physical location container for Parent and Child descriptions:

EAD for Parent Physical Location:


<did><physloc id="physloc0001">Row 2, Shelf AA</physloc>
<container type="box" label="hollinger" parent="physloc0001">NLA-031</container>
<unittitle encodinganalog="3.1.2">Summer Day Fonds</unittitle>

EAD for Child Physical Location:


<c level="series"><did><physloc id="physloc0001">Row 2, Shelf AA</physloc>
<container type="box" label="hollinger" parent="physloc0001">NLA- 032</container>
<unittitle encodinganalog="3.1.2">Photographs</unittitle>
<unitid encodinganalog="3.1.1">s01</unitid></did>
