Feature #1889
Add underscores and periods as tokenizers to value list in ElasticSearch
| Status: | New | Start date: | |
|---|---|---|---|
| Priority: | Low | Due date: | |
| Assignee: | Mike Cantelon | % Done: | 0% |
| Category: | Index/Search | | |
| Target version: | - | | |
| Google Code Legacy ID: | archivematica-1234 | Pull Request: | |
| Sponsored: | No | Requires documentation: | |
Description
See discussion at https://groups.google.com/forum/?hl=en&fromgroups=#!topic/archivematica/u1TSa2i9BJM .
[g] Legacy categories: Search
History
#1 Updated by Mike Cantelon over 8 years ago
Yeah, mentioning to users that they can do wildcard searches might be good rather than changing the tokenizer.
If a user enters "*master*" or "Members_Master2009.xls" now they'd end up seeing the entry for Members_Master2009.xls. If we change the tokenizer so it splits words up by underscores, etc., we might lose the ability to search for "Members_Master2009.xls" precisely.
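For reference, a wildcard search like "*master*" could be expressed as a `query_string` query. This is only a minimal sketch of a request body, assuming it is sent to an index's `_search` endpoint. Which index and fields the Archivematica search page actually queries is not stated in this ticket.

```json
{
  "query": {
    "query_string": {
      "query": "*master*"
    }
  }
}
```

Leading wildcards like this can be slow on large indexes, so this is probably better as a documented tip for users than as a default behaviour.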
I've posted a question online to ask for tokenizer advice on how we could tokenize "Members_Master_2009.xls" into "members", "master", and "Members_Master_2009.xls".
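One option that might give this behaviour is the `word_delimiter` token filter with `preserve_original` enabled, which keeps the unsplit token alongside the generated parts. The index settings below are an untested sketch, not a configuration confirmed for this ticket. The names `example_filename_analyzer` and `example_filename_parts` are hypothetical, and the `whitespace` tokenizer is an assumption, chosen so the whole filename reaches the filter as one token.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "example_filename_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["example_filename_parts", "lowercase"]
        }
      },
      "filter": {
        "example_filename_parts": {
          "type": "word_delimiter",
          "generate_word_parts": true,
          "generate_number_parts": true,
          "preserve_original": true
        }
      }
    }
  }
}
```

Because `lowercase` runs last, the preserved original would be indexed in lowercase too, and the number and extension parts would likely be emitted as separate tokens as well. It would be worth checking the actual output with the `_analyze` API before relying on it.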
#3 Updated by Mike Cantelon over 8 years ago
- Target version changed from Release 0.10-beta to Release 1.0.0
#4 Updated by Evelyn McLellan about 8 years ago
- Category set to Index/Search
#5 Updated by Mike Cantelon almost 8 years ago
Updating analysis then testing...
curl -XPOST 'http://192.168.1.70:9200/aips/_close'
curl -XPUT 'http://192.168.1.70:9200/aips/_settings' -d '{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "standard",
          "filter": ["preserve_hyphens_filter", "lowercase", "stop"]
        }
      },
      "filter": {
        "preserve_hyphens_filter": {
          "type": "word_delimiter",
          "generate_word_parts": false,
          "catenate_words": true
        }
      }
    }
  }
}'
curl -XPOST 'http://192.168.1.70:9200/aips/_open'
curl -XGET 'http://192.168.1.70:9200/aips/_analyze?field=msg&pretty=1' -d "Run to the hills and rock-around-town."
#6 Updated by Courtney Mumma over 7 years ago
- Target version changed from Release 1.0.0 to Release 1.1.0
#7 Updated by Justin Simpson about 7 years ago
- Target version deleted (Release 1.1.0)
We do not have advanced search functionality on the Archivematica roadmap at the present time. We would like to provide a better search interface at some point, but I am moving this ticket out of any targeted version queues until requirements can be generated for the feature.
#8 Updated by Justin Simpson over 4 years ago
- Priority changed from High to Low