Feature #6007

Add fulltext search to accessions search bar in 2.x

Added by Dan Gillean over 8 years ago. Updated almost 7 years ago.

Status: Verified
Start date: 11/21/2013
Priority: High
Due date:
Assignee: José Raddaoui Marín
% Done: 100%
Category: Accessions
Estimated time: 6.00 hours
Target version: Release 2.1.0
Google Code Legacy ID:
Tested version:
Sponsored: Yes
Requires documentation:

Description

After the move to using ES in 2.x, a search bar was added back to the Accessions module. However, it currently returns results only for the identifier. We need to restore at least its previous functionality.

Additionally, see the comments on issue #4271, where fulltext searching was implemented in 1.3: CVA and SFU both suggested improvements to search functionality for accessions. Fulltext searching on accessions is a key feature for many of our users who are considering upgrading to 2.x.

To get this feature included into AtoM, the suggestion is, for now, to implement this as it was implemented in 1.3. With more time, for a future major release (and hopefully pending support for development), we can consider adding additional features/improvements to the accessions search on a different issue ticket.


Related issues

Related to Access to Memory (AtoM) - Bug #4271: Add fulltext search for accessions Verified

History

#1 Updated by Dan Gillean over 8 years ago

Comment from client supporting feature inclusion: "Should also include processing note(accession_i18n.processing_notes) in the list of searchable fields." See related support ticket.

#2 Updated by Jesús García Crespo over 8 years ago

  • Status changed from New to QA/Review

#3 Updated by Dan Gillean over 8 years ago

  • Status changed from QA/Review to Feedback

Initial testing on 2x failed to return results from the expected fields. I could get results from immediate source of acquisition and location information, but not from any of the fields in the Administration area, nor on the donor name.

It's possible that the test site just needs to have search index rebuilt, and the cache cleared. I tested on 2 different browsers; pretty sure it's not a local browser cache issue.

#4 Updated by Jesús García Crespo over 8 years ago

  • Target version changed from Release 2.0.1 to Release 2.1.0

#5 Updated by Jesús García Crespo over 8 years ago

  • Assignee changed from Jesús García Crespo to José Raddaoui Marín
  • Estimated time set to 6.00

Dan, you are right. The piece of code responsible for the search is only looking at the i18n fields, and it's just using SQL. We should be adding accessions to ElasticSearch. Radda has done that before, so he should be able to do it.

#6 Updated by José Raddaoui Marín over 8 years ago

  • Status changed from Feedback to QA/Review
  • Assignee changed from José Raddaoui Marín to Dan Gillean
  • % Done changed from 0 to 100

#7 Updated by Dan Gillean over 8 years ago

  • Status changed from QA/Review to Feedback
  • Assignee changed from Dan Gillean to José Raddaoui Marín

Hi Radda,

having tested on the dev branch, I got results for every field except: donor name, primary contact name, contact information (for donor), and creator name. While I feel like the contact information for a donor is less important in an accessions search, the sponsoring client has specifically listed searching on Donor name as being of "high importance" - see #6011. I know this is being pulled in from the related donor record, but I'm hoping there's a way we can add hits to this accession search.

I did not test to see if the search results are being weighted in any way, but ideally, matches on accession number, title, donor name, and scope/content would be more heavily weighted - again, see #6011. I don't know how complex it is to weight searches - if it is really complex, it will have to wait until we can do the same with all the searches in AtoM - but if it's as simple as adding/changing some values in ES, we should try to create a weighting that reflects the order of importance listed in #6011.

Thanks!

#8 Updated by José Raddaoui Marín over 8 years ago

Hi Dan,

I forgot to tell you that you will need to rebuild the search index. After doing that you'll get results from the donor name. I haven't added the primary contact name, the contact information (for donor) or the creator name to the search index because they weren't specified in #6011. Please let me know if I should do it.

The fields for the query and their weight are:

'identifier^10'
'donorsName^10'
'i18n.'.$culture.'.title^10'
'i18n.'.$culture.'.scopeAndContent^10'
'i18n.'.$culture.'.locationInformation^5'
'i18n.'.$culture.'.processingNotes^5'
'i18n.'.$culture.'.sourceOfAcquisition^5'
'i18n.'.$culture.'.archivalHistory^5'
'i18n.'.$culture.'.appraisal'
'i18n.'.$culture.'.physicalCharacteristics'
'i18n.'.$culture.'.receivedExtentUnits'

The sort option was preventing the results from being ordered by weight, so I've added 'Relevancy' as the default sort order.

Regards.
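
As a rough illustration of the weighting listed in the comment above, the sketch below builds the same field list for a given culture and wraps it in an Elasticsearch `query_string` request body. It is written in Python, not AtoM's PHP. The function names are hypothetical, and the use of `query_string` and of a `_score` sort for 'Relevancy' are assumptions for illustration, not a copy of what was committed.

```python
def accession_query_fields(culture):
    """Return the boosted field list from comment #8 for one culture code."""
    prefix = "i18n." + culture + "."
    return [
        "identifier^10",
        "donorsName^10",
        prefix + "title^10",
        prefix + "scopeAndContent^10",
        prefix + "locationInformation^5",
        prefix + "processingNotes^5",
        prefix + "sourceOfAcquisition^5",
        prefix + "archivalHistory^5",
        # No explicit boost: Elasticsearch applies its default weight.
        prefix + "appraisal",
        prefix + "physicalCharacteristics",
        prefix + "receivedExtentUnits",
    ]


def build_accession_query(text, culture):
    """Assemble a request body (assumed shape) that sorts by relevance."""
    return {
        "query": {
            "query_string": {
                "query": text,
                "fields": accession_query_fields(culture),
            }
        },
        # Sorting on another field would override the boosts above,
        # so relevance ("_score") is used as the sort key here.
        "sort": ["_score"],
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_accession_query("smith", "en"), indent=2))
```

The `^10` and `^5` suffixes are Elasticsearch's per-field boost syntax, so a match on the identifier or donor name counts for more than a match on, say, appraisal.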

#9 Updated by Dan Gillean over 8 years ago

Radda, this has been merged into 2.x, yes? I've tested it there, and it seems to be working.

I would suggest that adding the creator name, and the primary contact name, would be the most important ones to add to the index, if possible - don't worry about the donor contact information since it's not in scope. However, if this will take a lot of time, we can let it go - it hasn't been listed as a priority in the related issue, as you pointed out.

Thanks!

#10 Updated by José Raddaoui Marín over 8 years ago

  • Status changed from Feedback to QA/Review

Hi Dan,

This is now included in 2.x: AtoM|commit: aad127a793075e3d0b112fea10bcca122f77fee8

After rebuilding the search index you'll get results from creator and primary contact names too. I've also fixed the relevance option based on your comments in #5743.

#11 Updated by José Raddaoui Marín over 8 years ago

  • Assignee changed from José Raddaoui Marín to Austin Trask

Hi Austin, please update the search index for 2x.test.artefactual.com again. Thanks ;)

#12 Updated by David Juhasz over 8 years ago

  • Assignee changed from Austin Trask to David Juhasz

I updated the search index on 2x.test today. Update completed around 12:40 PST

#13 Updated by David Juhasz over 8 years ago

  • Assignee changed from David Juhasz to Dan Gillean

#14 Updated by Dan Gillean over 8 years ago

  • Status changed from QA/Review to Verified
  • Assignee changed from Dan Gillean to José Raddaoui Marín

Nice Radda, looks good. Thanks!

#15 Updated by Dan Gillean over 7 years ago

  • Sponsored changed from No to Yes

#16 Updated by Dan Gillean almost 7 years ago

For reference, as of 2.2: Here is a list of all current searchable fields, via the accessions search box:

* id
* slug
* identifier
* date
* created_at
* updated_at
* i18n.%LANG%.appraisal
* i18n.%LANG%.archival_history
* i18n.%LANG%.location_information
* i18n.%LANG%.physical_characteristics
* i18n.%LANG%.processing_notes
* i18n.%LANG%.received_extent_units
* i18n.%LANG%.scope_and_content
* i18n.%LANG%.source_of_acquisition
* i18n.%LANG%.title

* donors.id
* donors.slug
* donors.contact_informations.contact_person
* donors.contact_informations.street_address
* donors.contact_informations.postal_code
* donors.contact_informations.country_code
* donors.contact_informations.location.lon
* donors.contact_informations.location.lat
* donors.contact_informations.i18n.%LANG%.contact_type
* donors.contact_informations.i18n.%LANG%.city
* donors.contact_informations.i18n.%LANG%.region
* donors.contact_informations.i18n.%LANG%.note
* donors.i18n.%LANG%.authorized_form_of_name
* donors.i18n.%LANG%.dates_of_existence
* donors.i18n.%LANG%.history
* donors.i18n.%LANG%.places
* donors.i18n.%LANG%.legal_status
* donors.i18n.%LANG%.functions
* donors.i18n.%LANG%.mandates
* donors.i18n.%LANG%.internal_structures
* donors.i18n.%LANG%.general_context
* donors.i18n.%LANG%.institution_responsible_identifier
* donors.i18n.%LANG%.rules
* donors.i18n.%LANG%.sources
* donors.i18n.%LANG%.revision_history

* creators.id
* creators.slug
* creators.created_at
* creators.updated_at
* creators.description_identifier
* creators.corporate_body_identifiers
* creators.entity_type_id
* creators.other_names.i18n.%LANG%.name
* creators.other_names.i18n.%LANG%.note
* creators.other_names.i18n.%LANG%.dates
* creators.parallel_names.i18n.%LANG%.name
* creators.parallel_names.i18n.%LANG%.note
* creators.parallel_names.i18n.%LANG%.dates
* creators.standardized_names.i18n.%LANG%.name
* creators.standardized_names.i18n.%LANG%.note
* creators.standardized_names.i18n.%LANG%.dates
* creators.i18n.%LANG%.authorized_form_of_name
* creators.i18n.%LANG%.dates_of_existence
* creators.i18n.%LANG%.history
* creators.i18n.%LANG%.places
* creators.i18n.%LANG%.legal_status
* creators.i18n.%LANG%.functions
* creators.i18n.%LANG%.mandates
* creators.i18n.%LANG%.internal_structures
* creators.i18n.%LANG%.general_context
* creators.i18n.%LANG%.institution_responsible_identifier
* creators.i18n.%LANG%.rules
* creators.i18n.%LANG%.sources
* creators.i18n.%LANG%.revision_history

NOTE: Not all of these have been tested yet. Weighting is not accounted for either.

#17 Updated by Dan Gillean almost 7 years ago

OOPS - important update to the above list: Any field expressed with underscores needs to be changed to camelCase. Example:

"creators.i18n.%LANG%.internal_structures" is actually "creators.i18n.%LANG%.internalStructures"

Also, wherever you see %LANG%, the i18n ISO language code for the culture you are searching in must be inserted. E.g. "creators.i18n.en.internalStructures" to search in English.
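
The two corrections above (underscores to camelCase, and the language placeholder replaced by an ISO code) can be applied mechanically to the list in comment #16. A minimal sketch in Python follows; the helper name `to_es_field` is hypothetical and not part of AtoM.

```python
import re


def to_es_field(name, culture):
    """Convert a listed field name to the form used in the search index.

    Underscore-separated segments become camelCase, and the %LANG%
    placeholder is replaced with the given culture code.
    """
    camel = re.sub(r"_([a-z])", lambda m: m.group(1).upper(), name)
    return camel.replace("%LANG%", culture)


if __name__ == "__main__":
    print(to_es_field("creators.i18n.%LANG%.internal_structures", "en"))
    # prints: creators.i18n.en.internalStructures
```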
