Feature #13096

Remove unnecessary data from Elasticsearch index

Added by David Juhasz 4 months ago. Updated 4 months ago.

Status: In progress
Start date: 06/21/2019
Priority: Medium
Due date:
Assignee: David Juhasz
% Done: 0%
Category: Search / Browse
Target version: Release 2.6.0
Google Code Legacy ID:
Tested version: 2.6
Sponsored: No
Requires documentation:

Description

In the archival description Elasticsearch documents, the data stored for each related object should be limited to the relevant fields listed below:

Repository:
Only keep the authorized form of name of the repository

Creators:
- Keep authorized form of name, parallel names, other forms of name, etc.
- Keep admin/bio history
- Remove everything else

Name access points:
Keep authorized form of name, parallel names, other forms of name, etc. Remove everything else.

Subjects, places, genres:
Keep only the name; remove all other fields (descriptions, etc.)

Physical storage/objects:
Remove all fields

Part of:
Remove all fields
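
The rules above amount to a per-relation whitelist of fields. The following is a minimal Python sketch of that idea, not AtoM's actual implementation (AtoM itself is written in PHP). All key and field names used here (`repository`, `creators`, `names`, `physicalObjects`, `partOf`, `authorizedFormOfName`, `parallelNames`, `otherNames`, `history`, `name`) are hypothetical placeholders and may not match the real Elasticsearch mapping. The "etc." in the name rules is left open, so the set of name fields is only an assumption.

```python
# Hypothetical field names; the real AtoM Elasticsearch mapping may differ.
NAME_FIELDS = {"authorizedFormOfName", "parallelNames", "otherNames"}

# Fields to keep for each related-object key of an archival description document.
# An empty set means the related object is dropped from the document entirely.
KEEP_FIELDS = {
    "repository": {"authorizedFormOfName"},
    "creators": NAME_FIELDS | {"history"},  # names plus admin/bio history
    "names": NAME_FIELDS,                   # name access points
    "subjects": {"name"},
    "places": {"name"},
    "genres": {"name"},
    "physicalObjects": set(),
    "partOf": set(),
}


def _keep(obj, allowed):
    """Return a copy of obj containing only the allowed keys."""
    return {key: value for key, value in obj.items() if key in allowed}


def prune_related(doc):
    """Return a copy of doc with related-object data reduced to the whitelist."""
    pruned = {}
    for key, value in doc.items():
        if key not in KEEP_FIELDS:
            pruned[key] = value  # not a related object: keep unchanged
            continue
        allowed = KEEP_FIELDS[key]
        if not allowed:
            continue  # "remove all fields": omit the related object
        if isinstance(value, list):
            pruned[key] = [_keep(item, allowed) for item in value]
        else:
            pruned[key] = _keep(value, allowed)
    return pruned
```

A whitelist is used rather than a list of fields to delete, so that any field added to a related object later stays out of the archival description document unless it is explicitly allowed.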


Related issues

Related to Access to Memory (AtoM) - Feature #10082: Improve Elasticsearch mappings for archival descriptions Verified 05/02/2016

History

#2 Updated by David Juhasz 4 months ago

  • Related to Feature #10082: Improve Elasticsearch mappings for archival descriptions added

#3 Updated by David Juhasz 4 months ago

I removed the unneeded repository and name access point data from the information object document in commit c08beb.

Further data removals are outstanding, but I'm out of time to work on this right now.

#4 Updated by David Juhasz 4 months ago

  • Description updated (diff)

We do want to keep the creator admin/bio history for consistency with the archival description UI, which displays it.

#5 Updated by David Juhasz 4 months ago

  • Description updated (diff)

Fix typos

#6 Updated by David Juhasz 4 months ago

  • Status changed from New to In progress
