Feature #9655

Improvements to search to better support searching indexed finding aid text

Added by Dan Gillean over 4 years ago. Updated almost 3 years ago.

Status: Verified
Start date: 01/28/2016
Priority: Medium
Due date:
Assignee: -
% Done: 0%
Category: Search / Browse
Estimated time: 40.00 hours
Target version: Release 2.4.0
Google Code Legacy ID:
Tested version:
Sponsored: Yes
Requires documentation:

Description

This feature will add elements to the advanced search panel, to better support the enhancements in #9627 (allow users to upload a PDF finding aid instead of generating one locally). When a finding aid is generated in AtoM, there is no point in indexing the text, because all of it is already included in Elasticsearch (as the descriptions are indexed). However, when an externally-created finding aid is uploaded and linked to a description, this issue will give users some additional features to enhance searching:

  • Add basic search results filter for logged-in users to limit to descriptions with uploaded PDFs
  • Add ability to limit search to indexed uploaded finding aid text in Advanced search
  • Add ability to limit search to just database records - exclude uploaded finding aid text from search results in Advanced search
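A minimal sketch of how the two advanced search limits above could map onto an Elasticsearch-style query. This is not AtoM's actual implementation: the field names (`findingAid.transcript`, `title`, `scopeAndContent`), the scope labels, and the `build_query` helper are all hypothetical and chosen only for illustration. The only real construct assumed here is Elasticsearch's `multi_match` query with its `query` and `fields` keys.

```python
from typing import Any, Dict, List

# Hypothetical field names, used for illustration only.
FINDING_AID_TEXT_FIELD = "findingAid.transcript"
DESCRIPTION_FIELDS: List[str] = ["title", "scopeAndContent"]


def build_query(term: str, scope: str = "all") -> Dict[str, Any]:
    """Build a multi_match query body restricted to the requested scope.

    scope is one of (hypothetical labels):
      "finding_aid_text" - search only the indexed uploaded finding aid text
      "database_records" - search only description fields, excluding that text
      "all"              - search both
    """
    if scope == "finding_aid_text":
        fields = [FINDING_AID_TEXT_FIELD]
    elif scope == "database_records":
        fields = list(DESCRIPTION_FIELDS)
    elif scope == "all":
        fields = DESCRIPTION_FIELDS + [FINDING_AID_TEXT_FIELD]
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    return {"query": {"multi_match": {"query": term, "fields": fields}}}
```

Restricting the field list, rather than filtering results afterwards, keeps hits from the excluded text out of scoring entirely, which is one plausible reading of "limit search to" in the list above.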

Related issues

Related to Access to Memory (AtoM) - Feature #9700: Show finding aid links at all levels Verified 01/28/2016
Related to Access to Memory (AtoM) - Feature #9627: Allow users to upload a PDF finding aid instead of genera... Verified 01/28/2016

History

#3 Updated by José Raddaoui Marín over 4 years ago

  • Status changed from New to Code Review
  • Assignee changed from José Raddaoui Marín to Jesús García Crespo

The changes for this feature are included in PR 310

Please take a look at the latest commits and, if you think it's all right, re-assign the ticket to me so I can add some test and documentation notes to the related tickets.

#4 Updated by Jesús García Crespo over 4 years ago

  • Status changed from Code Review to In progress
  • Assignee changed from Jesús García Crespo to José Raddaoui Marín

LGTM

#5 Updated by Dan Gillean over 4 years ago

  • Target version set to Release 2.4.0

#6 Updated by José Raddaoui Marín about 4 years ago

  • Status changed from In progress to QA/Review
  • Assignee changed from José Raddaoui Marín to Dan Gillean
  • Target version deleted (Release 2.4.0)

Please check https://projects.artefactual.com/issues/9627#note-13 before this one.

Other notes for testing and documentation:

- Finding aid transcript is only obtained for uploaded PDFs
- It's included in all, so it should get hits from the normal search (using the search-box)
- The filter in adv. search has options to allow filtering uploaded and generated finding aids
- New filter and boolean options should also work as request parameters in the CSV export from the browse page and in the browse information objects endpoint from the REST API plugin

#7 Updated by José Raddaoui Marín about 4 years ago

  • Target version set to Release 2.4.0

#8 Updated by Dan Gillean about 4 years ago

  • Related to Feature #9700: Show finding aid links at all levels added

#9 Updated by José Raddaoui Marín about 4 years ago

  • Status changed from QA/Review to Feedback
  • Assignee changed from Dan Gillean to José Raddaoui Marín

Further feedback from Tim:

Mostly things seem to be working as expected, and I like how you’ve designed things to address what could have been complicated workflows in some cases. (Getting rid of the default “unknown” status is a nice bonus.)

A couple times a finding aid has not successfully uploaded, but there is no error in the interface so you can’t tell until you try to download it. It’s intermittent so I can’t tell you how to reproduce it (re-uploading the same finding aid is successful, so it’s not an issue with the file). In the worker log file, the upload is reported as successful but an error getting the transcript is reported:
2016-04-25 08:28:28 > Job 49281 "arFindingAidJob": Uploading finding aid (maria-green-fonds)...
2016-04-25 08:28:28 > Job 49281 "arFindingAidJob": Finding aid uploaded successfully: /var/www/html/atom-issue-9627/downloads/maria-green-fonds.pdf
2016-04-25 08:28:28 > Job 49281 "arFindingAidJob": Obtaining finding aid transcript...
2016-04-25 08:28:28 > Job 49281 "arFindingAidJob": Obtaining the transcript has failed.

Perhaps more usefully, the PDF file exists but the file size is 0 bytes. I’ve encountered this twice so far, once for a scanned finding aid and once for a PDF freshly generated by Word. And it turns out this error is not reported for a PDF with no transcript.

My other comment at this point is that I wonder if “finding aid transcript” will be meaningful to most end users. Maybe “finding aid text”?

#10 Updated by José Raddaoui Marín about 4 years ago

  • Status changed from Feedback to Code Review
  • Assignee changed from José Raddaoui Marín to Jesús García Crespo

Hi again Sevein, I've fixed the issues reported by Tim. I tried to do the whole upload in the job, but uploaded files are deleted from the temp directory at the end of the request unless they are moved, copied or renamed first. So we have to do the copying part synchronously before calling the job to extract the 'transcript' (now called 'text' in the GUI) and other properties. Again, it's all in the same PR 310. Thanks!
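For readers following along, the ordering described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in the PR (AtoM itself is written in PHP): `save_upload`, `handle_request`, and the `enqueue` callback are hypothetical names, and the empty-file check is an assumption added here to surface the 0-byte case Tim reported rather than something the ticket confirms.

```python
import shutil
from pathlib import Path
from typing import Callable


def save_upload(tmp_path: Path, downloads_dir: Path, slug: str) -> Path:
    """Copy the uploaded temp file to a permanent location, synchronously."""
    downloads_dir.mkdir(parents=True, exist_ok=True)
    target = downloads_dir / f"{slug}.pdf"
    shutil.copyfile(tmp_path, target)
    # Assumed safeguard: fail loudly instead of keeping a 0-byte PDF.
    if target.stat().st_size == 0:
        target.unlink()
        raise ValueError(f"uploaded file is empty: {tmp_path}")
    return target


def handle_request(
    tmp_path: Path,
    downloads_dir: Path,
    slug: str,
    enqueue: Callable[[Path], None],
) -> Path:
    """Copy first, inside the request; only then hand off to the job."""
    target = save_upload(tmp_path, downloads_dir, slug)
    # The job (hypothetical callback) only ever sees the permanent copy,
    # so it no longer matters when the temp file is cleaned up.
    enqueue(target)
    return target
```

The point of the ordering is that the background job never touches the temp path, so the text extraction cannot race with the end-of-request cleanup.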

#11 Updated by Jesús García Crespo about 4 years ago

  • Status changed from Code Review to In progress
  • Assignee changed from Jesús García Crespo to José Raddaoui Marín

Looking good! Thanks.

#12 Updated by Dan Gillean about 4 years ago

  • Status changed from In progress to QA/Review
  • Assignee changed from José Raddaoui Marín to Dan Gillean

#13 Updated by Dan Gillean about 4 years ago

  • Status changed from QA/Review to Verified

#15 Updated by Dan Gillean over 3 years ago

  • Description updated (diff)

(Fix ref to internal issue ticket in description)

#16 Updated by Dan Gillean over 3 years ago

  • Related to Feature #9627: Allow users to upload a PDF finding aid instead of generating one from AtoM's descriptions added

#17 Updated by Dan Gillean almost 3 years ago

  • Assignee deleted (Dan Gillean)
  • Requires documentation deleted (Yes)
