Improve ES analyzers to work better with diacritics
|Assignee:||Dan Gillean|
|Category:||Search / Browse|
|Target version:||Release 2.3.0|
1. Searches return different results depending on whether or not the query includes accents/diacritics.
For example, searches for "évaluation" and "evaluation", "hôtel" and "hotel", or "déjà" and "deja" do not return the same results.
Would it be possible to change the catalogue so that searches ignore diacritical marks / accents?
2. When we search for a word like "évaluation", the results do not include all the descriptions containing "l'évaluation" or "d'évaluation".
Would it be possible to fix this so that entering "évaluation" in the search menu returns all the results, including descriptions with "l'évaluation" and "d'évaluation"?
This is a problem for every word beginning with a, e, i, o, u or y, since these are often preceded by "l'" or "d'" in French. The apostrophe (') should be treated as a separator between two words.
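The two requests above can be illustrated with a small, self-contained Python sketch. It is not Elasticsearch code and the function names are hypothetical; it only imitates the desired behaviour by stripping combining accents and treating the apostrophe as a word separator, so that "évaluation", "evaluation" and "l'évaluation" all yield the same search token.

```python
import re
import unicodedata


def fold_diacritics(text):
    """Remove combining marks so that 'é' compares equal to 'e'."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Fold accents, lowercase, then split on anything that is not a word character.

    The apostrophe is not a word character, so "l'évaluation" becomes
    the two tokens "l" and "evaluation", as requested in point 2.
    """
    return re.findall(r"\w+", fold_diacritics(text).lower())


if __name__ == "__main__":
    for query in ("évaluation", "evaluation", "l'évaluation", "d'évaluation", "Hôtel", "déjà"):
        print(query, "->", analyze(query))
```

With this behaviour a query for "évaluation" and a description containing "l'évaluation" share the token "evaluation". A real analyzer would typically also discard the leftover "l" and "d" tokens, and Elasticsearch's own folding covers more characters than this sketch does.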
#10 Updated by José Raddaoui Marín almost 7 years ago
I've added the asciifolding filter to the ES default analyzer to make it work with non-i18n fields.
The search index needs to be rebuilt again.
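The actual change is not reproduced in this ticket. As a rough illustration only, index settings that add the asciifolding token filter to a default analyzer could look something like the sketch below. The analyzer layout (custom type, standard tokenizer, lowercase filter) is an assumption and is not taken from AtoM's configuration; only the asciifolding filter comes from the comment above.

```python
import json

# Hypothetical settings body: the layout is an assumption, not AtoM's real configuration.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                # An analyzer named "default" applies to fields that do not
                # declare their own analyzer.
                "default": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # asciifolding converts accented characters to ASCII
                    # equivalents, e.g. "é" becomes "e".
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    }
}

if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(index_settings, indent=2, ensure_ascii=False))
```

Because analysis settings are applied when documents are indexed, existing documents only pick up the new analyzer once the index has been rebuilt.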