POST _analyze
{
  "analyzer": "whitespace",
  "text": "The quick brown fox."
}

POST _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase", "asciifolding" ],
  "text": "Is this déja vu?"
}
Positions and character offsets
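As the output of the analyze API shows, analyzers do more than convert words into terms. They also record the order, or relative position, of each term (used for phrase queries and word proximity queries), and the start and end character offsets of each term in the original text (used for highlighting search snippets).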
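The idea of positions and offsets can be illustrated outside Elasticsearch with a small Python sketch. It only imitates whitespace tokenization; the function name `whitespace_tokens` is hypothetical, and the key names mirror the fields returned by the analyze API. It assumes positions are numbered from 0 and that the end offset is exclusive.

```python
import re

def whitespace_tokens(text):
    """Split on whitespace, recording each token's position and character offsets."""
    return [
        {
            "token": match.group(),
            "start_offset": match.start(),  # index of the token's first character
            "end_offset": match.end(),      # index just past the token's last character
            "position": position,           # order of the token in the stream
        }
        for position, match in enumerate(re.finditer(r"\S+", text))
    ]

for token in whitespace_tokens("The quick brown fox."):
    print(token)
```

A phrase query can use `position` to check that two terms are adjacent, while a highlighter can use `start_offset` and `end_offset` to cut the matching slice out of the original text, for example `text[16:20]` for the last token above.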
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_folded": { #1
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "my_text": {
          "type": "text",
          "analyzer": "std_folded" #2
        }
      }
    }
  }
}
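
To get a feel for what the std_folded chain (standard tokenizer, then lowercase, then asciifolding) does to text, the following Python sketch approximates it outside Elasticsearch. It is not the Lucene implementation: `\w+` merely stands in for the standard tokenizer's Unicode word-boundary rules, and stripping combining marks after NFKD decomposition stands in for asciifolding, which handles more characters than this does. The function names `fold_to_ascii` and `analyze` are hypothetical.

```python
import re
import unicodedata

def fold_to_ascii(token):
    """Rough stand-in for asciifolding: decompose, then drop combining accents."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def analyze(text):
    """Approximate the tokenizer -> lowercase -> asciifolding chain."""
    # Compose accents first so an accented letter stays inside a single \w+ match.
    text = unicodedata.normalize("NFC", text)
    tokens = re.findall(r"\w+", text)           # crude word tokenizer
    lowered = [token.lower() for token in tokens]
    return [fold_to_ascii(token) for token in lowered]

print(analyze("Is this déjà vu?"))
```

The order of the steps matches the analyzer definition: the tokenizer runs first, and each token filter is then applied to the token stream in the order listed.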

GET my_index/_analyze #3
{
  "analyzer": "std_folded", #4
  "text": "Is this déjà vu?"
}

GET my_index/_analyze #5
{
  "field": "my_text", #6
  "text": "Is this déjà vu?"
}