I have a keyword field of type Array that is generated when the object is created. What tokenizer should I use for indexing it? I couldn’t find this information on elasticsearch.org.
keyword value (array):
['george', 'apple', 'eats', 'new', 'york']
It all depends on your data and what you want to do with it. For example, can a keyword be composed of multiple words? If so, do you want a single word to match or not while searching? Also, do you want it to be case-sensitive or not?
If you want to have only exact matches, case-sensitive, you don’t even need to analyze the field and you can configure it as index: not_analyzed in your mapping.
If you don’t want it to be case-sensitive you can analyze it and use the keyword tokenizer, which does no tokenization, and the lowercase token filter.
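To make the two options above concrete, here is a rough sketch of what the index settings and mapping could look like, written as a plain Python dict that can be serialized to JSON. It assumes the older string-field mapping syntax that goes with index: not_analyzed. The field names (keywords_exact, keywords_ci), the analyzer name (lowercase_keyword) and the type name (my_type) are made up for illustration; adjust them to your own index.

```python
import json

# Assumption: legacy string-field mapping syntax, matching the
# "index: not_analyzed" setting mentioned in the answer.
# All field, analyzer and type names below are hypothetical.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer: the keyword tokenizer emits the whole value
                # as one token, and the lowercase filter removes case sensitivity.
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                # Option 1: exact, case-sensitive matches, no analysis at all.
                "keywords_exact": {"type": "string", "index": "not_analyzed"},
                # Option 2: exact matches, but case-insensitive.
                "keywords_ci": {"type": "string", "analyzer": "lowercase_keyword"},
            }
        }
    },
}

# Print the JSON body you would send when creating the index.
print(json.dumps(index_body, indent=2))
```

An array value such as ['george', 'apple'] needs no special mapping: each element is indexed as a separate value of the same field.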
If a keyword can be composed of more than one word and you want every single word to match, you need to tokenize it, for example using the whitespace tokenizer or even the default standard analyzer.
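The difference between these choices is easiest to see by looking at the terms that end up in the index. The snippet below is only a pure-Python imitation of the keyword and whitespace tokenizers with an optional lowercase step. It does not call Elasticsearch, the function names are hypothetical, and the multi-word value 'New York' is an assumed example rather than something from the question.

```python
def keyword_terms(value, lowercase=False):
    """Imitate the keyword tokenizer: the whole value becomes one term."""
    return [value.lower() if lowercase else value]


def whitespace_terms(value, lowercase=False):
    """Imitate the whitespace tokenizer: split the value on whitespace."""
    terms = value.split()
    return [t.lower() for t in terms] if lowercase else terms


def index_terms(values, tokenize, lowercase=False):
    """Collect the set of terms produced for every element of an array field."""
    terms = set()
    for value in values:
        terms.update(tokenize(value, lowercase=lowercase))
    return terms


# Hypothetical array value containing one multi-word keyword.
keywords = ["george", "apple", "New York"]

exact = index_terms(keywords, keyword_terms)                        # like not_analyzed
exact_ci = index_terms(keywords, keyword_terms, lowercase=True)     # keyword + lowercase
per_word = index_terms(keywords, whitespace_terms, lowercase=True)  # whitespace + lowercase

print("New York" in exact)     # True: only the exact, case-sensitive value matches
print("new york" in exact)     # False
print("new york" in exact_ci)  # True: case no longer matters
print("york" in exact_ci)      # False: a single word does not match
print("york" in per_word)      # True: every word is its own term
```

A search term has to equal one of the indexed terms to match, so whether 'york' should find 'New York' decides which of the three setups you need.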