Bug #11999

Avoid pagination over 10,000 records in ES queries - search/browse result pages over 1,000 lead to ES error

Added by José Raddaoui Marín about 1 year ago. Updated 4 days ago.

Status: Verified
Start date: 02/26/2018
Priority: Medium
Due date:
Assignee: -
% Done: 0%
Category: Search / Browse
Target version: Release 2.5.0
Google Code Legacy ID:
Tested version: 2.5
Sponsored: No
Requires documentation:

Description

Elasticsearch introduced a new setting in 2.x to avoid memory issues in deep pagination:

index.max_result_window

The maximum value of from + size for searches to this index. Defaults to 10000. Search requests take heap memory and time proportional to from + size and this limits that memory. See Scroll or Search After for a more efficient alternative to raising this.

In the AtoM 2.5+ release, we have upgraded to ES 5.x (see issues #10847 and #11567), so we've now inherited this issue.

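This means that for users with 10,000+ records in an AtoM installation, trying to navigate to any page above 1,000 in the search/browse results (when the results per page setting is set to the default value of 10) will lead to an Elasticsearch error. For example, page 1,001 requires from = 10,000 and size = 10, so from + size is 10,010, which exceeds the default limit of 10,000.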
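The arithmetic behind that limit can be sketched as follows. This is an illustration only, not AtoM code: the function names are hypothetical, and it assumes 1-based page numbers and the default index.max_result_window of 10000.

```python
# Elasticsearch default for index.max_result_window (from the docs quoted above).
MAX_RESULT_WINDOW = 10000


def exceeds_result_window(page, per_page=10, max_window=MAX_RESULT_WINDOW):
    """Return True if a 1-based page would need from + size > max_window."""
    offset = (page - 1) * per_page  # the "from" value sent to Elasticsearch
    return offset + per_page > max_window


def last_reachable_page(per_page=10, max_window=MAX_RESULT_WINDOW):
    """Highest 1-based page whose from + size still fits in the window."""
    return max_window // per_page


if __name__ == "__main__":
    print(exceeds_result_window(1000))  # last page inside the window
    print(exceeds_result_window(1001))  # first page outside the window
    print(last_reachable_page(per_page=50))
```

With a larger results-per-page setting the last reachable page number drops accordingly, since the limit applies to the record offset rather than to the page number.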

Using the Scroll or Search After APIs is not an option for us, because neither allows jumping directly to an intermediate page, and the Scroll API doesn't return aggregations.

Increasing the index.max_result_window value also increases the cost of the queries and we have instances with over a million descriptions.

Therefore, we'll limit the pagination in ES queries to 10000 records, showing a note about the issue in the pages over that limit. To ensure that users can still access those results if needed, we will also add the ability to specify the sort order in the sort button - for more details see subtask #12000.
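A minimal sketch of this kind of guard is shown below, assuming the out-of-range request is sent back to the first page with a notice (as in the attached screenshots). The function name and the notice wording are hypothetical and are not taken from the AtoM codebase.

```python
def resolve_page(requested_page, per_page=10, max_window=10000):
    """Return (page_to_show, notice) for a 1-based page request.

    If the requested page would need from + size > max_window, fall back
    to page 1 and return a notice for the user; otherwise notice is None.
    """
    page = max(1, requested_page)  # treat zero or negative pages as page 1
    offset = (page - 1) * per_page
    if offset + per_page > max_window:
        # Hypothetical wording; the real message lives in the application.
        notice = (
            "Results beyond the first %d records cannot be displayed. "
            "Try narrowing your search or changing the sort order."
            % max_window
        )
        return 1, notice
    return page, None


if __name__ == "__main__":
    print(resolve_page(5))
    print(resolve_page(1001))
```

Checking the offset before building the query means the over-limit request never reaches Elasticsearch, so no error is raised there.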

See also the following notes from the Elastic team about this issue:

https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_fetch_phase.html

pager-results-redirect.png (151 KB) Dan Gillean, 03/14/2018 02:42 PM

pager-results-redirect-02.png (69.4 KB) Dan Gillean, 03/14/2018 02:42 PM


Related issues

Related to Access to Memory (AtoM) - Feature #12000: Add alternative sort directions to the sort button in sea... Verified 02/26/2018

History

#1 Updated by José Raddaoui Marín about 1 year ago

  • Copied to Feature #12000: Add alternative sort directions to the sort button in search/browse pages added

#2 Updated by José Raddaoui Marín about 1 year ago

  • Copied to deleted (Feature #12000: Add alternative sort directions to the sort button in search/browse pages)

#3 Updated by José Raddaoui Marín about 1 year ago

  • Related to Feature #12000: Add alternative sort directions to the sort button in search/browse pages added

#4 Updated by Dan Gillean about 1 year ago

  • Subject changed from Avoid pagination over 10000 records in ES queries to Avoid pagination over 10,000 records in ES queries - search/browse result pages over 1,000 lead to ES error
  • Description updated (diff)

#5 Updated by Dan Gillean about 1 year ago

  • Description updated (diff)

#6 Updated by José Raddaoui Marín about 1 year ago

Places using ES pagination and affected by this issue:

- IO browse page
- Actor browse page
- Repository browse page
- Accessions browse page
- Clipboard
- Move action
- Description updates
- Actor index page (related descriptions list)
- Repository index page (holdings)
- Repository index page (maintained actors)
- IO inventory report
- Taxonomy index page
- Term index page (IO results)
- Term index page (terms list)
- API: IO browse endpoint

#7 Updated by José Raddaoui Marín about 1 year ago

  • Status changed from New to In progress

#8 Updated by José Raddaoui Marín about 1 year ago

  • Status changed from In progress to Code Review
  • Assignee changed from José Raddaoui Marín to Nick Wilkinson

Ready for code review.

#9 Updated by Nick Wilkinson about 1 year ago

  • Assignee changed from Nick Wilkinson to Steve Breker

Hi Steve, can you please take a look for CR?

#10 Updated by Steve Breker about 1 year ago

  • Status changed from Code Review to Feedback
  • Assignee changed from Steve Breker to José Raddaoui Marín

CR looks great - approved!

#11 Updated by José Raddaoui Marín about 1 year ago

  • Status changed from Feedback to QA/Review
  • Assignee changed from José Raddaoui Marín to Dan Gillean

Merged in qa/2.5.x

#12 Updated by Dan Gillean about 1 year ago

Looks good! Attaching a couple images of the notification shown when redirected back to the first page of results, for reference and for use in the docs.

#13 Updated by José Raddaoui Marín about 1 year ago

A small note about the pics: the sort options appear in the wrong order, which may be happening because the CSS file in the arDominionPlugin was not rebuilt, or because of a browser cache issue.

#14 Updated by Dan Gillean 4 days ago

  • Assignee deleted (Dan Gillean)
  • Requires documentation deleted (Yes)
