Task #10506

Improve actor matching on descriptions import

Added by José Raddaoui Marín over 3 years ago. Updated almost 3 years ago.

Status: Verified
Start date: 11/02/2016
Priority: Medium
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: Import/Export
Target version: Release 2.4.0
Google Code Legacy ID:
Tested version:
Sponsored: Yes
Requires documentation:

Description

After the changes introduced in #8642 and #10144, the actor matching on descriptions import should work as described in the following scenarios:

Scenario 1(A):

GIVEN we are importing an archival description CSV file (ISAD or RAD) with the import options "no matching, create new" or "on match, delete and re-add"
WHEN the CSV "eventActor" column value matches an actor_i18n.authorized_form_of_name value
AND the CSV "repository" column value matches the controlling repository authorized_form_of_name
AND the CSV "adminHistory" column value DOES NOT match the actor_i18n.admin_history value
THEN a new actor should be created and linked to the description with actor_i18n.admin_history = the import column "adminHistory"

Summary: If you try to import new records, or update existing ones using delete and replace, and there’s a match on authority record name AND controlling institution BUT NOT on the admin/bio history, AtoM will create a new authority record instead of overwriting the match’s existing history. If you want to update the existing admin/bio history, use --update="match-and-update" instead.

Scenario 1(B):

GIVEN we are importing an archival description CSV file (ISAD or RAD) with the option "update on match"
WHEN the CSV "eventActor" column value matches an actor_i18n.authorized_form_of_name value
AND the CSV "repository" column value matches the controlling repository authorized_form_of_name
AND the CSV "adminHistory" column value DOES NOT match the actor_i18n.admin_history value
THEN the existing actor_i18n.admin_history should be updated to match the import column "adminHistory"

Summary: If you’re trying to update existing descriptions using "match-and-update" and there’s a match on authority record name AND controlling institution BUT NOT on the admin/bio history, then AtoM will update the current admin/bio history. If you don’t want this to happen, you can either exclude the admin/bio history (no changes will be made to the linked authority record) or use the --update="delete-and-replace" option instead, in which case a new authority record will be created.

Scenario 2:

GIVEN we are importing an archival description CSV file (ISAD or RAD) with any import or update option
WHEN the CSV "eventActor" column value matches an actor_i18n.authorized_form_of_name value
AND the CSV "repository" column value matches the controlling repository authorized_form_of_name
AND the CSV "adminHistory" column value is blank OR matches the actor_i18n.admin_history value
THEN the actor should be considered a match for linking but NO actor data should be updated in the linked authority record

Summary: If you’re importing new records or using any of the update options, and there’s a match on authority record name AND controlling institution AND the admin/bio history column has either been left blank or is also an exact match, then AtoM will link to the existing authority record without updating it.

Scenario 3:

GIVEN we are importing an archival description CSV file (ISAD or RAD)
WHEN the CSV "eventActor" column value matches an actor_i18n.authorized_form_of_name value
AND the CSV "repository" column value DOES NOT match the controlling repository authorized_form_of_name
AND the CSV "adminHistory" column value DOES NOT match the actor_i18n.admin_history value
THEN a new actor should be created and linked to the description with actor_i18n.admin_history = the import column "adminHistory"

Summary: If you’re importing new records and there’s a match on an authority record’s name BUT neither the repository nor the admin/bio history matches, then a new authority record will be created (to avoid overwriting another institution’s history). If you want to link to the existing authority record, omit the history from your CSV import or make it match the current one exactly.

Scenario 4:

GIVEN we are importing an archival description CSV file (ISAD or RAD)
WHEN the CSV "eventActor" column value matches an actor_i18n.authorized_form_of_name value
AND the CSV "repository" column value DOES NOT match the controlling repository authorized_form_of_name
AND the CSV "adminHistory" column value is blank OR matches the actor_i18n.admin_history value
THEN the actor should be considered a match for linking but NO actor data should be updated in the linked authority record

Summary: If you’re importing new descriptions and there’s a match on an authority record’s name AND the history is either blank or also matches exactly on the existing authority record, BUT the repository does NOT match, AtoM will link to the existing authority record without making any changes to it.
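
The scenarios above can be read as a small decision table. The following is a minimal Python sketch of that table, not AtoM's actual code (AtoM is written in PHP). The names UpdateMode, Action and resolve_actor_action are hypothetical and exist only for illustration. It assumes the CSV "eventActor" value has already matched an existing authorized form of name, and that "matches" means an exact string comparison.

```python
from enum import Enum


class UpdateMode(Enum):
    # Hypothetical labels for the import options named in the scenarios.
    CREATE_NEW = "no matching, create new"
    MATCH_AND_UPDATE = "match-and-update"
    DELETE_AND_REPLACE = "delete-and-replace"


class Action(Enum):
    LINK_EXISTING = "link to the existing actor without changing it"
    UPDATE_HISTORY = "update the existing actor's history and link to it"
    CREATE_ACTOR = "create a new actor with the CSV history and link to it"


def resolve_actor_action(mode, repository_matches, csv_history, existing_history):
    """Pick an action for an eventActor whose name already matched an actor.

    Assumption: a blank adminHistory is None, empty, or whitespace only.
    """
    history_is_blank = not (csv_history or "").strip()

    # Scenarios 2 and 4: blank or identical history links without changes,
    # whether or not the repository matches.
    if history_is_blank or csv_history == existing_history:
        return Action.LINK_EXISTING

    # Scenario 3: the history differs and the repository does not match,
    # so another institution's history is never overwritten.
    if not repository_matches:
        return Action.CREATE_ACTOR

    # Scenario 1(B): repository matches and the option is "update on match".
    if mode is UpdateMode.MATCH_AND_UPDATE:
        return Action.UPDATE_HISTORY

    # Scenario 1(A): repository matches, but the option is "create new"
    # or "delete and re-add".
    return Action.CREATE_ACTOR
```

Checking the blank-or-identical case first keeps the remaining branches limited to the cases where the CSV history would actually change something.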


Related issues

Related to Access to Memory (AtoM) - Feature #8642: Add relation rows to associate actors with repositories Verified 07/03/2015
Related to Access to Memory (AtoM) - Feature #10144: Enhance import matching behaviors: Add ability to limit ... Verified 06/10/2016
Related to Access to Memory (AtoM) - Task #8641: Remove fetching existing actors by repository Verified 07/03/2015
Duplicated by Access to Memory (AtoM) - Task #8904: Make CSV import task's actors matching as robust as EAD Duplicate 09/02/2015

History

#1 Updated by José Raddaoui Marín over 3 years ago

  • Related to Feature #8642: Add relation rows to associate actors with repositories added

#2 Updated by José Raddaoui Marín over 3 years ago

  • Related to Feature #10144: Enhance import matching behaviors: Add ability to limit matching to a specific repository or top-level description added

#3 Updated by José Raddaoui Marín over 3 years ago

  • Related to Task #8641: Remove fetching existing actors by repository added

#4 Updated by José Raddaoui Marín over 3 years ago

  • Requires documentation set to Yes

#5 Updated by José Raddaoui Marín over 3 years ago

For XML import, where only "delete-and-replace" can be set as an update option, and in other places that call the setActorByName() method from information objects: if the actor history is populated on import, it will try to match an existing actor on both the authorized form of name and the history, or else create a new one. Otherwise, it will only try to match on the authorized form of name. There is no need to check the maintaining repository relation, as the actor history should only be updated if the maintaining repository matches the information object repository when the update option is set to "match-and-update".
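
The matching described in this comment could look roughly like the Python sketch below. This is an illustration only, not the setActorByName() implementation: find_or_create_actor and the dictionary-based actor records are hypothetical. Creating a new actor when a name-only lookup finds nothing is an assumption, since the comment does not say what happens in that case.

```python
def find_or_create_actor(actors, name, history=None):
    """Return (actor, created) for a list of {"name", "history"} dicts.

    Hypothetical stand-in for the lookup described above; it does not
    consult the maintaining repository relation.
    """
    has_history = bool(history and history.strip())

    for actor in actors:
        if actor["name"] != name:
            continue
        # With a history on import, both name and history must match.
        # Without one, the name alone is enough.
        if not has_history or actor.get("history") == history:
            return actor, False

    # No match: create a new actor. For the name-only case this is an
    # assumption, not something the comment states.
    new_actor = {"name": name, "history": history if has_history else None}
    actors.append(new_actor)
    return new_actor, True
```

For example, importing the name "Jane Doe" with a history that differs from the stored one adds a second "Jane Doe" record, while importing the same name with no history reuses the existing record.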

#6 Updated by José Raddaoui Marín over 3 years ago

  • Status changed from In progress to Code Review
  • Assignee changed from José Raddaoui Marín to Mike Gale

#7 Updated by Mike Gale over 3 years ago

  • Assignee changed from Mike Gale to José Raddaoui Marín

Wow, lots of cases to take into account! The code looks good to me.

#8 Updated by José Raddaoui Marín over 3 years ago

  • Status changed from Code Review to QA/Review
  • Assignee changed from José Raddaoui Marín to Dan Gillean

Merged in qa/2.4.x

#10 Updated by Dan Gillean about 3 years ago

CSV import documentation updated as part of: https://github.com/artefactual/atom-docs/commit/94ad54befffd1d803bd78793a99ced61474fa764

Still need to test and update documentation for XML imports.

#11 Updated by Dan Gillean about 3 years ago

  • Duplicated by Task #8904: Make CSV import task's actors matching as robust as EAD added

#12 Updated by Dan Gillean almost 3 years ago

  • Status changed from QA/Review to Verified
  • Requires documentation deleted (Yes)
