Feature #13096

Remove unnecessary repository and actor data from information object Elasticsearch index

Added by David Juhasz over 1 year ago. Updated 3 months ago.

Status: Verified
Start date: 06/21/2019
Priority: Medium
Due date:
Assignee: -
% Done: 0%
Category: Search / Browse
Target version: Release 2.6.0
Google Code Legacy ID:
Tested version: 2.6
Sponsored: No
Requires documentation:

Description

In the archival description Elasticsearch documents, data from related objects should be limited to the following, so that only relevant information is stored:

Repository:
Only keep the authorized form of name of the repository

Name access points:
Keep authorized form of name, parallel names, other forms of name, etc. Remove everything else.
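
The pruning described above could be sketched roughly as follows. This is only an illustration in Python, not the AtoM implementation: the field names (`authorizedFormOfName`, `parallelNames`, `otherNames`, `repository`, `names`) and the helper functions are hypothetical, and the "etc." in the name access point list is left unresolved, so the real set of kept fields may be larger.

```python
from copy import deepcopy

# Hypothetical field names; the actual AtoM Elasticsearch mapping may differ.
REPOSITORY_KEEP = {"authorizedFormOfName"}
ACTOR_KEEP = {"authorizedFormOfName", "parallelNames", "otherNames"}


def prune(related, keep):
    """Return a copy of a related-object dict containing only the kept keys."""
    return {key: deepcopy(value) for key, value in related.items() if key in keep}


def prune_description(doc):
    """Return a copy of an archival description document with slimmed related data."""
    slim = deepcopy(doc)
    if "repository" in slim:
        slim["repository"] = prune(slim["repository"], REPOSITORY_KEEP)
    if "names" in slim:
        slim["names"] = [prune(actor, ACTOR_KEEP) for actor in slim["names"]]
    return slim
```

Using an allow-list rather than a list of fields to drop means that any field later added to a related object stays out of the description document unless it is explicitly kept.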


Related issues

Related to Access to Memory (AtoM) - Feature #10082: Improve Elasticsearch mappings for archival descriptions Verified 05/02/2016
Related to Access to Memory (AtoM) - Task #13252: Look into use of partial ES update in repository bulk upd... New 01/30/2020
Related to Access to Memory (AtoM) - Task #13273: Use Elasticsearch's "update by query API" to update relat... New 03/13/2020
Copied to Access to Memory (AtoM) - Feature #13386: Remove unnecessary data from Elasticsearch index and redu... New 06/21/2019

History

#2 Updated by David Juhasz over 1 year ago

  • Related to Feature #10082: Improve Elasticsearch mappings for archival descriptions added

#3 Updated by David Juhasz over 1 year ago

I removed the unneeded repository and name access point data from the information object document with commit c08beb.

Further data removals are outstanding, but I'm out of time to work on this right now.

#4 Updated by David Juhasz over 1 year ago

  • Description updated (diff)

We do want to keep the creator's administrative/biographical history, for consistency with the archival description UI, which displays it.

#5 Updated by David Juhasz over 1 year ago

  • Description updated (diff)

Fix typos

#6 Updated by David Juhasz over 1 year ago

  • Status changed from New to In progress

#7 Updated by Dan Gillean 8 months ago

  • Related to Task #13252: Look into use of partial ES update in repository bulk updating code added

#8 Updated by José Raddaoui Marín 7 months ago

  • Related to Task #13273: Use Elasticsearch's "update by query API" to update related resources added

#9 Updated by Dan Gillean 4 months ago

  • Status changed from In progress to Verified
  • Assignee deleted (David Juhasz)

Considering this closed - if we return to it we can open a new ticket.

#10 Updated by Dan Gillean 3 months ago

  • Copied to Feature #13386: Remove unnecessary data from Elasticsearch index and reduce unnecessary re-index operations added

#11 Updated by Dan Gillean 3 months ago

  • Subject changed from Remove unnecessary data from Elasticsearch index to Remove unnecessary repository and actor data from information object Elasticsearch index

#12 Updated by Dan Gillean 3 months ago

  • Description updated (diff)
