normalizer(归一化)
keyword
fields(关键字字段) 的 normalizer(
归一化)属性与 analyzer
(
分析器)类似,只不过它保证 analysis chain(分析链)生成单一的 token(词元
).
normalizer(
归一化)应用于索引 keyword(关键字)之前,以及诸如在 match
query
(匹配查询),查询解析器搜索 keyword fields(关键字字段)的时候.
curl -XPUT 'localhost:9200/index?pretty' -H 'Content-Type: application/json' -d'
{
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"char_filter": [],
"filter": ["lowercase", "asciifolding"]
}
}
}
},
"mappings": {
"type": {
"properties": {
"foo": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
'
curl -XPUT 'localhost:9200/index/type/1?pretty' -H 'Content-Type: application/json' -d'
{
"foo": "BÀR"
}
'
curl -XPUT 'localhost:9200/index/type/2?pretty' -H 'Content-Type: application/json' -d'
{
"foo": "bar"
}
'
curl -XPUT 'localhost:9200/index/type/3?pretty' -H 'Content-Type: application/json' -d'
{
"foo": "baz"
}
'
curl -XPOST 'localhost:9200/index/_refresh?pretty'
curl -XGET 'localhost:9200/index/_search?pretty' -H 'Content-Type: application/json' -d'
{
"query": {
"match": {
"foo": "BAR"
}
}
}
'
上述查询与 documents(文档) 1和2相匹配,这是因为在索引和查询的时候都将 BAR
转换为了 bar
.
{
"took": $body.took,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.2876821,
"hits": [
{
"_index": "index",
"_type": "type",
"_id": "2",
"_score": 0.2876821,
"_source": {
"foo": "bar"
}
},
{
"_index": "index",
"_type": "type",
"_id": "1",
"_score": 0.2876821,
"_source": {
"foo": "BÀR"
}
}
]
}
}
此外,keywords 在索引之前转换意味着聚合返回 normalised values(归一化的值):
curl -XGET 'localhost:9200/index/_search?pretty' -H 'Content-Type: application/json' -d'
{
"size": 0,
"aggs": {
"foo_terms": {
"terms": {
"field": "foo"
}
}
}
}
'
返回
{
"took": 43,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0.0,
"hits": []
},
"aggregations": {
"foo_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "bar",
"doc_count": 2
},
{
"key": "baz",
"doc_count": 1
}
]
}
}
}
Last updated