Feature #2862

Merge imported files to existing records

Added by Peter Van Garderen almost 12 years ago. Updated about 5 years ago.

Status: New
Start date: 04/01/2009
Priority: Medium
Due date:
Assignee: -
% Done:
Category: -
Estimated time: 100.00 hours
Target version: -
Sponsored: No
Tested version:


This is a new feature that would take an XML import file and check whether the
record(s) it contains are already in the database. If so, it assumes that the
XML import contains updates/changes that must be merged into the existing
record(s) rather than saving the imported file as new record(s). This new
feature is related to the ability to save previous versions of a record
(see #2861). The newly merged record would be saved as the next
version of the record so that the import merging can be reviewed by an
authorized user.
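
To make the proposed behaviour concrete, here is a minimal sketch of "merge as the next version". It is only an illustration under stated assumptions: the in-memory `store` dictionary, the `import_record` function and the field names are all hypothetical and do not reflect the application's actual data model.

```python
def import_record(store, identifier, imported_fields):
    """Merge imported fields into an existing record, or create a new one.

    `store` is a hypothetical in-memory stand-in for the database: it maps
    a record identifier to the list of that record's versions, oldest first.
    """
    versions = store.get(identifier)
    if versions is None:
        # No existing record with this identifier: save as a new record.
        store[identifier] = [dict(imported_fields)]
        return "created"
    # Existing record: imported values override the latest version's values,
    # and the result is appended as the next version. Earlier versions are
    # left untouched so the merge can be reviewed later.
    merged = {**versions[-1], **imported_fields}
    versions.append(merged)
    return "merged"


store = {"F-001": [{"title": "Old title", "extent": "2 m"}]}
status = import_record(store, "F-001", {"title": "New title"})
print(status, len(store["F-001"]), store["F-001"][-1])
```

Because the previous version stays in the list, a reviewer could compare the last two entries before accepting the merge.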

[g] Legacy categories: Import/Export

Related issues

Duplicated by Access to Memory (AtoM) - Feature #3873: Allow update of existing descriptions via EAD (Status: Duplicate)


#1 Updated by David Juhasz almost 11 years ago

  • Priority changed from High to Low

How can you determine whether a record is already in the database? Is there some unique field that can guarantee this, or would the user need to manually compare values?

[g] Labels added: Priority-Low, Milestone-Release-Post-1.2
[g] Labels removed: Priority-High, Milestone-Release-1.1

#3 Updated by Anonymous over 10 years ago

Two example use cases we need at Ghent University:

- I have an archival description and want to add items. I export the EAD file. This file contains the identifiers needed to re-ingest the EAD and update the record.
- I have an archival description with a unique identifier. A programmer can prepare a batch of EAD files using the unique identifier. The programmer issues a batch upload. The items are added to the archival description using the unique identifier.

Both use cases require no manual input by a user.

#4 Updated by David Juhasz about 10 years ago

  • Assignee deleted (Anonymous)

#5 Updated by Anonymous almost 10 years ago

  • Target version set to Release 1.3

[g] Labels added: Milestone-Release-1.3

#6 Updated by Anonymous about 9 years ago

  • Target version changed from Release 1.3 to Release 2.1.0

In EAD we will merge if the <archdesc> id matches a description identifier. Otherwise we will create a new record.
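
The rule above could be sketched roughly as follows. This assumes a DTD-style EAD 2002 file without XML namespaces, and that the identifiers of existing descriptions are available as a plain set. `plan_import` and `existing_identifiers` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET


def plan_import(ead_xml, existing_identifiers):
    """Return ("merge", id) if the <archdesc> id is already known,
    otherwise ("create", id)."""
    root = ET.fromstring(ead_xml)
    archdesc = root.find("archdesc")
    ident = archdesc.get("id") if archdesc is not None else None
    if ident is not None and ident in existing_identifiers:
        return ("merge", ident)
    # No id attribute, or no matching description: import as a new record.
    return ("create", ident)


sample = (
    '<ead><eadheader><eadid>X</eadid></eadheader>'
    '<archdesc id="F-001" level="fonds">'
    '<did><unitid>F-001</unitid></did>'
    '</archdesc></ead>'
)
print(plan_import(sample, {"F-001"}))
print(plan_import(sample, {"F-002"}))
```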

[CCAD-34: Routine maintenance of imports]

[g] Labels added: Milestone-Release-2.0
[g] Labels removed: Component-Versioning, Milestone-Release-1.3
[g] New owner: David Juhasz

#7 Updated by Tim Hutchinson about 9 years ago

I'm glad to see this is being addressed but I'd like to suggest a refinement:
- on the EAD side, I'd use <eadid identifier="xxx">. <archdesc id="xxx"> should really map to the main identifier. This field (identifier) is also not used consistently at least in Canadian practice, since it's not in RAD - so it's not necessarily stable, if it's populated at all.
- ideally, I think the field used in ICA-AtoM shouldn't be editable (e.g. a new field sourceID or something). But if this is used for periodic feeds (e.g. from provincial networks into Archives Canada), maybe that's not a critical issue.

This way, <eadid identifier="xxx"> can be used for a unique/system identifier if there is one (or an appropriate combination of fields generated by the contributing system), and <archdesc id="xxx"> can then be used for the information actually needed by the end user, which could be edited if necessary.

#8 Updated by Tim Hutchinson about 9 years ago

I just realized my explanation above is partly inaccurate - I was mixing up <archdesc id="xxx"> and <did><unitid>. <unitid> is the element that maps to the main identifier. However, I would still argue that <eadid identifier="xxx"> is a better location for the unique identifier to be used for matching and merging. <archdesc id="xxx"> is in fact intended to be used as a linking attribute, and technically it only needs to be unique within a given EAD instance.
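
As a rough illustration of this split between a matching key and the displayed identifier, the sketch below reads both values from an EAD file. It assumes a DTD-style EAD 2002 document without namespaces. The function name and the keys of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_identifiers(ead_xml):
    """Read the proposed matching key and the main identifier from an EAD file."""
    root = ET.fromstring(ead_xml)
    eadid = root.find("eadheader/eadid")
    unitid = root.find("archdesc/did/unitid")
    return {
        # <eadid identifier="..."> : stable system key used for match/merge
        "match_key": eadid.get("identifier") if eadid is not None else None,
        # <did><unitid> : the main identifier shown to end users
        "display_identifier": (
            (unitid.text or "").strip() if unitid is not None else None
        ),
    }


sample = (
    '<ead><eadheader><eadid identifier="sys-42">XXXX</eadid></eadheader>'
    '<archdesc level="fonds"><did><unitid>F 12</unitid></did></archdesc></ead>'
)
ids = extract_identifiers(sample)
print(ids)
```

Keeping the two values separate means the displayed identifier could be edited without breaking later merges, which is the point of the suggestion above.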

#9 Updated by David Juhasz about 9 years ago

Reassign to David's new account.

[g] New owner: David Juhasz

#10 Updated by Jessica Bushey over 8 years ago

In the EAD header we are using:
<eadid url=" " encodinganalog="Identifier">XXXX</eadid>

In the EAD <archdesc level> we are using:
<unitid countrycode=" " encodinganalog="3.1.1">XXXX</unitid>

#11 Updated by Dan Gillean over 8 years ago

  • Priority changed from Low to Medium
  • Start date set to 04/01/2009

#12 Updated by Jesús García Crespo over 8 years ago

  • Tracker changed from Bug to Feature
  • Category set to Import/Export
  • Target version deleted (Release 2.1.0)
  • Estimated time set to 100.00
  • Sponsored set to No

#13 Updated by Tim Hutchinson over 7 years ago

I was just noticing that the opening description links to the wrong issue, i.e. it uses the qubit number in the current system. The correct issue (re saving previous versions of records) is #2861.

That said, I don't think #2861 is necessarily a prerequisite for the current issue. For any import, you're going to need to test carefully before doing it in production, so I'm not sure having the ability to revert to an earlier record is a deal breaker.

It would be great to see this one developed, but clearly it's not simple :)

#14 Updated by Dan Gillean over 7 years ago

  • Description updated

Thanks Tim; I've updated the link in the issue description. Hopefully we can consider such a development some time soon!

#15 Updated by Dan Gillean about 6 years ago

  • Project changed from Access to Memory (AtoM) to AtoM Wishlist
  • Category deleted (Import/Export)

Moving to AtoM Wishlist subproject until sponsored for inclusion or taken on by Artefactual developers.

#16 Updated by David Juhasz about 5 years ago

  • Assignee deleted (David Juhasz)
