Feature #1537

Management of persistent MCP metadata needed for statistical reports

Added by Peter Van Garderen over 7 years ago. Updated almost 3 years ago.

Status: New
Start date:
Priority: Low
Due date:
Assignee: Mike Cantelon
% Done: 0%
Category: Data management
Estimated time: 24.00 hours
Target version: -
Google Code Legacy ID: archivematica-882
Pull Request:
Sponsored: Yes
Requires documentation:

Description

For example, AIP and DIP location information. This information will need to be backed up and maintained across release upgrades.

Likely strategy: keep all of this metadata in one place (i.e. the MCP database), include a cron script that performs MySQL dumps, and then document procedures and recommendations for getting backups off the MCP server (e.g. onto AIP storage?).
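A minimal sketch of what the cron-driven dump step could look like, in Python. The database name `MCP`, the backup directory, and the function names are assumptions for illustration only, not anything defined in this ticket. It assumes `mysqldump` is on the PATH and reads its credentials from an option file such as `~/.my.cnf`.

```python
from datetime import datetime
from pathlib import Path
import subprocess


def build_dump_command(database, backup_dir, now):
    """Return the mysqldump argv and the target file for one timestamped dump."""
    target = Path(backup_dir) / f"{database}-{now:%Y%m%dT%H%M%S}.sql"
    command = [
        "mysqldump",
        "--single-transaction",      # consistent snapshot for InnoDB tables
        f"--result-file={target}",   # write the dump directly to the target file
        database,
    ]
    return command, target


def run_dump(database, backup_dir):
    """Create the backup directory if needed and run one dump (hypothetical helper)."""
    command, target = build_dump_command(database, backup_dir, datetime.now())
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return target
```

A cron entry could then call `run_dump("MCP", "/var/backups/mcp")` on a schedule; copying the resulting files off the MCP server would still need the documented procedure mentioned above.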

[g] Legacy categories: Data management

History

#1 Updated by Peter Van Garderen over 7 years ago

[g] New owner: Joseph Perry

#2 Updated by Joseph Perry over 7 years ago

I'll need to know what data needs to be preserved.

-I'm thinking a no-sql database as part of the AIP?
-Or one for each storage location?

[g] New owner: Evelyn McLellan

#3 Updated by Evelyn McLellan over 7 years ago

The data that should be preserved is:

For AIPs and DIPs:
AIP name
AIP UUID
AIP storage location
Date placed in storage
Date updated (when we have AIP versioning)
AIP size
related DIP
DIP storage location
DIP size
Date DIP uploaded

For transfer backups:
Transfer name
Transfer UUID
Transfer size
Transfer storage location
Date placed in storage
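The two lists above could be sketched as record types along the following lines. All class and field names here are hypothetical and only mirror the listed items; optional fields stand in for data that may not exist yet (for example, before a DIP is uploaded or before AIP versioning exists).

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Optional


@dataclass
class AIPRecord:
    """One stored AIP plus its related DIP, if any (field names are illustrative)."""
    name: str
    uuid: str
    storage_location: str
    stored_at: datetime
    size_bytes: int
    updated_at: Optional[datetime] = None       # only once AIP versioning exists
    related_dip: Optional[str] = None
    dip_storage_location: Optional[str] = None
    dip_size_bytes: Optional[int] = None
    dip_uploaded_at: Optional[datetime] = None


@dataclass
class TransferBackupRecord:
    """One transfer backup (field names are illustrative)."""
    name: str
    uuid: str
    size_bytes: int
    storage_location: str
    stored_at: datetime
```

`asdict(record)` turns either record into a plain dictionary, which is convenient if the data ends up in a database row or an index document.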

Courtney to do mock-ups so I'm making her the owner.

[g] Labels added: Component-DataManagement
[g] Labels removed: Component-Backup
[g] New owner: Courtney Mumma

#4 Updated by Courtney Mumma over 7 years ago

Additional metadata:

Accession Number to both AIP and Transfer

#5 Updated by Courtney Mumma about 7 years ago

Joseph - I think that one DB per storage location would work best, with minimal MD available for overview in the Administration tab of the Archivematica Dashboard.

See mockup of the Admin tab here: http://archivematica.org/wiki/index.php?title=Transfer_Backup_Requirements#Administration_Tab_in_Dashboard

#6 Updated by Joseph Perry about 7 years ago

Because the order of storing an AIP and uploading a DIP can vary, this information should be held in the ES index for Archivematica 0.9, inserted as part of the processing chain (or microservice) for uploading or storing.

In future revisions, this information should be included in the upload/store or updated through an API. === DO NOT CLOSE THIS ISSUE TILL THIS IS DONE === (bump to 1.0 once 0.9 requirements are met)

[g] New owner: Mike Cantelon

#7 Updated by Courtney Mumma about 7 years ago

  • Target version changed from Release 0.9 to Release 0.10-beta

[g] Labels added: Milestone-Release-1.0
[g] Labels removed: Milestone-Release-0.9

#8 Updated by Courtney Mumma about 7 years ago

We need requirements for management/statistical reports (aggregate processing reports, e.g. success/fail, formats, etc.). We also need processes for updating the ES index after tasks. Break these out into separate issues.

#9 Updated by Courtney Mumma about 7 years ago

  • Subject set to Management of persistent MCP metadata needed for statistical reports

#10 Updated by Courtney Mumma over 6 years ago

  • Assignee changed from Mike Cantelon to Justin Simpson
  • Sponsored set to No

#11 Updated by Justin Simpson over 6 years ago

  • Estimated time set to 24.00

#12 Updated by Courtney Mumma over 6 years ago

Team recently discussed using the METS MD to generate these reports from stored AIPs. Evelyn and Courtney to examine and ask community what kinds of reports they'd like from the dashboard.

#13 Updated by Courtney Mumma over 6 years ago

  • Assignee changed from Justin Simpson to Evelyn McLellan

#14 Updated by Evelyn McLellan over 6 years ago

  • Assignee changed from Evelyn McLellan to Courtney Mumma

Here is the list of data to be saved with the current location in METS identified where applicable:

For AIPs:
-AIP name - In METS structMap: <div TYPE="directory" LABEL="[AIPname]-[UUID]">
-AIP UUID - In METS structMap: <div TYPE="directory" LABEL="[AIPname]-[UUID]">
-AIP storage location - Not in METS file
-Date placed in storage - Not in METS file
-Date updated (when we have AIP versioning) - will be in METS header <metsHdr CREATEDATE="2013-05-09T15:00:00" LASTMODDATE="2014-02-09T21:00:00">
-AIP size - Not in METS file
-Related DIP - Not in METS file
-DIP storage location - Not in METS file
-DIP size - Not in METS file
-Date DIP uploaded - Not in METS file

For transfer backups:
-Transfer name - In METS structMap: <div TYPE="directory" LABEL="[Transfername]-[UUID]">
-Transfer UUID - In METS structMap: <div TYPE="directory" LABEL="[Transfername]-[UUID]">
-Transfer size - Not in METS file
-Transfer storage location - Not in METS file
-Date placed in storage - Not in METS file
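For the fields that are in the METS file, a rough sketch of how the name, UUID, and last-modified date might be read back with Python's standard library. It assumes the METS elements use the standard `http://www.loc.gov/METS/` namespace and that the label ends in a canonical 36-character UUID; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}

# "[name]-[UUID]": the name may itself contain hyphens, so anchor the UUID at the end.
LABEL_RE = re.compile(
    r"^(?P<name>.+)-(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def read_name_and_uuid(mets_xml):
    """Return (name, uuid) from the first matching directory div, or None."""
    root = ET.fromstring(mets_xml)
    path = ".//mets:structMap//mets:div[@TYPE='directory']"
    for div in root.iterfind(path, METS_NS):
        match = LABEL_RE.match(div.get("LABEL", ""))
        if match:
            return match.group("name"), match.group("uuid")
    return None


def read_last_modified(mets_xml):
    """Return the metsHdr LASTMODDATE attribute, or None if it is absent."""
    root = ET.fromstring(mets_xml)
    header = root.find("mets:metsHdr", METS_NS)
    return None if header is None else header.get("LASTMODDATE")
```

The items marked "Not in METS file" (storage locations, sizes, storage and upload dates) cannot be recovered this way and would still need another source.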

#15 Updated by Evelyn McLellan over 6 years ago

Other fields to include:
-Logged-in user (should be captured as PREMIS agent)
-UUID of the Archivematica instance (should be captured as PREMIS agent)
-Possibly also environment data: what machines did Archivematica live on, what versions of all the tools were installed (already in PREMIS events), what version of Archivematica was used (already in software agent).

#16 Updated by Courtney Mumma over 6 years ago

A request for metrics has been sent out to the Archivematica and digital curation discussion groups.

A wiki page has been added for this feature / set of features: https://www.archivematica.org/wiki/Metrics_requirements

#17 Updated by Courtney Mumma over 6 years ago

  • Assignee changed from Courtney Mumma to Mike Cantelon
  • Target version changed from Release 0.10-beta to Release 1.0.0

#18 Updated by Evelyn McLellan over 6 years ago

  • Category set to Data management

#19 Updated by Courtney Mumma about 6 years ago

  • Target version changed from Release 1.0.0 to Release 1.1.0
  • Sponsored changed from No to Yes

#20 Updated by Justin Simpson over 5 years ago

  • Target version deleted (Release 1.1.0)

This functionality is at least partially provided by the Storage Service, and pointer files. I am removing from the 1.1 release queue.

#21 Updated by Justin Simpson almost 3 years ago

  • Priority changed from High to Low
