Feature #9995

Enhance import matching behaviors: add a digital object checksum column to CSV import/export templates and use it on import for first match when importing updates

Added by Dan Gillean about 4 years ago. Updated over 2 years ago.

Status: Verified
Start date: 06/10/2016
Priority: Medium
Due date:
Assignee: Dan Gillean
% Done: 0%
Category: Import/Export
Estimated time: 12.00 hours
Target version: Release 2.4.0
Google Code Legacy ID:
Tested version:
Sponsored: Yes
Requires documentation:

Description

Since the 2.0 release, AtoM uses the digital object's SHA256 checksum to generate the digital object URLs (e.g. http://example.com/uploads/r/my-repository/0/9/9/0956e36f27cdb64cbfee7a92a262a140e9794d66e7ad846a3dc6df0c9d52017e/my-image.jpg).

While this makes for some long URLs, it also gives us a very convenient way to tell whether a digital object has been updated: if the checksum has changed, it's a new digital object.
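As a rough illustration of that check, the sketch below computes a file's SHA-256 digest and compares it with a previously stored value. It is written in Python for brevity and is not AtoM's own code; the function names `sha256_of_file` and `has_changed` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large digital objects are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, stored_checksum):
    """True when the file's digest differs from the stored checksum.

    Hypothetical helper: the stored value is normalized (trimmed, lowercased)
    before comparison, which is an assumption about how it might be stored.
    """
    return sha256_of_file(path) != stored_checksum.strip().lower()
```

Any difference in the file's bytes produces a different digest, which is what makes the checksum usable as a change marker.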

This feature will make two changes. First, a new digital object checksum column will be included in CSV imports and exports, and added to the sample templates found in lib/task/import/example.
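To give a sense of what an export with the extra column might look like, here is a small Python sketch that writes rows to CSV with a checksum field. The column names (`legacyId`, `title`, `digitalObjectURI`, `digitalObjectChecksum`) are assumptions made for illustration; the sample templates in lib/task/import/example are the authority on the real headers.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
FIELDNAMES = ["legacyId", "title", "digitalObjectURI", "digitalObjectChecksum"]


def export_rows(rows):
    """Serialize a list of row dicts to CSV text, header line included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Because the checksum travels with the row, a later import of the same file can compare it with the value already held for the matched description.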

Second, when an import updates existing descriptions, we will use this checksum to avoid re-downloading a digital object that hasn't changed since the last import and regenerating its derivatives.

When no checksum value is included during import, or the checksum does not match the existing one, the digital object will be assumed to be new, and therefore part of the update. In this case, the original digital object will be removed, and the new path to the associated digital object will be used to recreate and attach a new digital object (including generating new derivatives).
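The rule described here can be sketched as a small decision function. This is an illustrative Python outline under stated assumptions, not the actual import task: the name `plan_digital_object_update` and the "skip" / "replace" return values are hypothetical, and treating a description with no existing digital object as "replace" is an assumption.

```python
def plan_digital_object_update(csv_checksum, existing_checksum):
    """Decide what an update import does with a row's digital object.

    Returns "skip" when the CSV checksum matches the existing one, so no
    download or derivative generation is needed. Returns "replace" when
    the CSV checksum is missing or differs, meaning the original digital
    object is removed and a new one is created from the row's path.
    """
    incoming = (csv_checksum or "").strip().lower()
    existing = (existing_checksum or "").strip().lower()
    if incoming and existing and incoming == existing:
        return "skip"
    # Missing or mismatched checksum: assume the digital object is new.
    return "replace"
```

For example, a row whose checksum cell is empty always yields "replace", which matches the behavior described for imports that omit the value.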


Related issues

Related to Access to Memory (AtoM) - Feature #10144: Enhance import matching behaviors: Add ability to limit ... Verified 06/10/2016
Blocked by Access to Memory (AtoM) - Bug #10085: CSV import: digital object URLs cannot be imported, produ... Verified 06/28/2016

History

#2 Updated by Steve Breker about 4 years ago

  • Status changed from New to QA/Review
  • Assignee changed from Steve Breker to Dan Gillean

#3 Updated by Dan Gillean about 4 years ago

  • Blocked by Bug #10085: CSV import: digital object URLs cannot be imported, produce a "could not resolve host" message in the console added

#4 Updated by Dan Gillean about 4 years ago

  • Status changed from QA/Review to Feedback
  • Assignee changed from Dan Gillean to Steve Breker

See related issue #10085

#5 Updated by Steve Breker about 4 years ago

  • Status changed from Feedback to QA/Review
  • Assignee changed from Steve Breker to Dan Gillean

#6 Updated by Dan Gillean almost 4 years ago

  • Related to Feature #10144: Enhance import matching behaviors: Add ability to limit matching to a specific repository or top-level description added

#7 Updated by Dan Gillean almost 4 years ago

  • Target version changed from Release 2.4.0 to Release 2.5.0

#8 Updated by Dan Gillean over 3 years ago

  • Target version changed from Release 2.5.0 to Release 2.4.0

#9 Updated by Dan Gillean almost 3 years ago

  • Status changed from QA/Review to Verified

#10 Updated by Dan Gillean over 2 years ago

  • Requires documentation deleted (Yes)
