Indexing a field twice to solve string sorting

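Sorting on a string field often gives inaccurate results. A value such as "test my elasticsearch" is split into several terms by the analyzer, so the sort operates on the individual terms rather than on the original string, and a document can end up at the front merely because of one of its terms. What we actually want is for "test my elasticsearch" to be ordered as one whole string.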
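The effect can be illustrated with a minimal Python sketch. It is only a model, not Elasticsearch's implementation: `analyze` is a hypothetical stand-in for an analyzer (lowercase, split on whitespace), representing each document by its smallest term assumes the `min` mode that an ascending sort applies to a multi-valued field, and the titles are made up for the illustration.

```python
def analyze(text):
    """Hypothetical stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def sort_by_analyzed(titles):
    # Ascending sort where each title is represented by its smallest term,
    # which is an assumption about how a multi-term field is reduced to one sort value.
    return sorted(titles, key=lambda title: min(analyze(title)))


def sort_by_raw(titles):
    # Ascending sort on the whole, untokenized string.
    return sorted(titles)


titles = ["zebra article", "my big test", "elasticsearch sort"]

analyzed_order = sort_by_analyzed(titles)
raw_order = sort_by_raw(titles)

print(analyzed_order)  # ['zebra article', 'my big test', 'elasticsearch sort']
print(raw_order)       # ['elasticsearch sort', 'my big test', 'zebra article']
```

In this model "zebra article" sorts first on the analyzed field only because it contains the term "article", which is the kind of surprise described above.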

The usual solution is to index the string field twice: once analyzed, for full-text search, and once not analyzed, for sorting.

Example:

PUT /website 
{
  "mappings": {
    "article": {
      "properties": {
        "title": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          },
          "fielddata": true
        },
        "content": {
          "type": "text"
        },
        "post_date": {
          "type": "date"
        },
        "author_id": {
          "type": "long"
        }
      }
    }
  }
}

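The title field gets an additional sub-field, raw, whose index parameter is not_analyzed, so its value is not tokenized and the whole title is indexed as a single term. Note that string with not_analyzed is the older spelling; on Elasticsearch 5.x and later, where the analyzed type is text as used here, the non-analyzed counterpart is the keyword type.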

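Setting fielddata to true on title builds the in-memory forward index (fielddata) for the analyzed field, which is required before that field itself can be used for sorting.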

"title": {
  "type": "text",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  },
  "fielddata": true
}
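As a rough mental model of what this mapping keeps for each document, here is a short Python sketch. The function `index_title` is hypothetical and the whitespace tokenization is a simplification of real analysis; it only shows that one input value yields two representations.

```python
def index_title(title):
    """Return the two representations a multi-field mapping would keep for one title."""
    return {
        # Analyzed: one lowercase term per word, used for full-text search.
        "title": title.lower().split(),
        # Not analyzed: the exact string as a single term, used for sorting.
        "title.raw": title,
    }


doc = index_title("first article")
print(doc["title"])      # ['first', 'article']
print(doc["title.raw"])  # first article
```

The document body still contains a single title value; the second representation is derived at index time and is addressed in queries as title.raw.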

Insert some data:

PUT /website/article/1
{
  "title": "first article",
  "content": "this is my first article",
  "post_date": "2017-01-01",
  "author_id": 110
}

PUT /website/article/2
{
  "title": "second article",
  "content": "this is my second article",
  "post_date": "2017-02-01",
  "author_id": 110
}

PUT /website/article/3
{
  "title": "third article",
  "content": "this is my third article",
  "post_date": "2017-03-01",
  "author_id": 110
}

GET /website/article/_search
{
  "query": {
    "match_all": {}
  }
}

Compare this with the same search sorted on title.raw:

GET /website/article/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "title.raw": {
        "order": "desc"
      }
    }
  ]
}
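For the three sample documents, the order this sort should produce can be worked out by plain string comparison. The Python sketch below does exactly that; it is an expectation derived from comparing the whole titles, not output captured from a cluster, and the helper name `sort_desc_by_raw` is made up for the illustration.

```python
# Document IDs mapped to the titles inserted above.
docs = {1: "first article", 2: "second article", 3: "third article"}


def sort_desc_by_raw(docs):
    # Descending sort on the whole, untokenized title string.
    return sorted(docs, key=lambda doc_id: docs[doc_id], reverse=True)


order = sort_desc_by_raw(docs)
print(order)  # [3, 2, 1]
```

That is, "third article" comes first, then "second article", then "first article".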
