Bug #5982

ISDIAH Region facet: Two-word terms are broken up into two separate regions

Added by Dan Gillean over 8 years ago. Updated almost 8 years ago.

Status: Verified
Start date: 11/15/2013
Priority: Medium
Due date:
Assignee: Jesús García Crespo
% Done: 0%
Category: Repository
Target version: Release 2.1.0
Google Code Legacy ID:
Tested version:
Sponsored: No
Requires documentation:

Description

To reproduce
  • Create a new repository or edit an existing one
  • In the contact area, edit the contact information - in the Physical location tab of the Contact modal, enter "British Columbia"
  • Click submit on the contact modal, and save the record
  • Navigate to Browse > Archival institutions and look at the Region facet filter results.

Resulting error
2 filters appear as "british" and "columbia"

Expected result
1 filter for "british columbia" appears

This will be important to fix before any BC data is imported into 2.x, or data from anywhere else with a two-part region name!

History

#1 Updated by Dan Gillean over 8 years ago

  • Subject changed from ISDIAH Region facet: Two-word terms are broken up into two separate regions in the Places facet filters to ISDIAH Region facet: Two-word terms are broken up into two separate regions

#2 Updated by Tim Hutchinson over 8 years ago

Related to this, is it possible to retain the capitalization e.g. British Columbia rather than british columbia? This seems to be the only facet where that happens, but your use of "british columbia" in the issue description suggests that maybe there's a reason for that...

#3 Updated by Dan Gillean over 8 years ago

If there's a reason, I don't know what it is, Tim. I agree that we should retain the term as the user enters it.

#4 Updated by Tim Hutchinson over 8 years ago

I'm guessing the two issues are related. This reminded me of a former database where capitalization got stripped out in the index - but in this case it seems to be behaving like a keyword index. I also noticed that a hyphenated name gets split into two entries.

#5 Updated by Jesús García Crespo over 8 years ago

  • Target version changed from Release 2.0.1 to Release 2.0.2

#6 Updated by Jesús García Crespo over 8 years ago

  • Status changed from New to QA/Review

Fixed in 74fe1c7bb39ac759c9cbbb3591b059846c157090. Capitalization is also respected now. It was a pretty easy fix: we were running the facet against the analyzed version of the field in Elasticsearch, but the non_analyzed version is the one that contains just "British Columbia". FYI: we also use non_analyzed fields for sorting and similar things, which is why we keep them.
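
For anyone following along, here is a rough sketch in plain Python of why faceting on the analyzed field produced the split, lowercased entries. It does not use Elasticsearch and is not the code from the commit. The `analyze` and `facet_counts` helpers are hypothetical, the tokenizer only approximates what a standard text analyzer does (lowercase, split on non-word characters), and "Saint-Pierre" is a made-up hyphenated region name.

```python
import re
from collections import Counter


def analyze(value):
    """Approximate an analyzed text field: lowercase, split on non-word characters.

    Hypothetical helper; a real analyzer is configurable and does more.
    """
    return [token for token in re.split(r"\W+", value.lower()) if token]


def facet_counts(regions, analyzed):
    """Count how many records carry each facet term.

    analyzed=True  -> terms are the tokens of the analyzed field
    analyzed=False -> the single term is the stored value, untouched
    """
    counts = Counter()
    for region in regions:
        # A record contributes at most once to each distinct term.
        terms = set(analyze(region)) if analyzed else {region}
        counts.update(terms)
    return dict(counts)


# Region values as users might enter them on three repository records.
regions = ["British Columbia", "British Columbia", "Saint-Pierre"]

analyzed_facet = facet_counts(regions, analyzed=True)
raw_facet = facet_counts(regions, analyzed=False)

print("analyzed:", analyzed_facet)
print("untouched:", raw_facet)
```

With the analyzed field, the facet ends up with separate "british" and "columbia" entries (and the hyphenated name splits the same way). With the untouched value, each region stays a single entry with its original capitalization.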

#7 Updated by Tim Hutchinson over 8 years ago

Excellent! This will make our work on #5596 a lot easier.

#8 Updated by Dan Gillean over 8 years ago

  • Status changed from QA/Review to Verified

Seems to be behaving as expected in 2.x - Tim, the capitalization is now maintained as the user enters it as well.

#9 Updated by Dan Gillean almost 8 years ago

  • Target version changed from Release 2.0.2 to Release 2.1.0
