# Language Analyzers

A set of analyzers aimed at analyzing text in a specific language. The following types are supported: [`arabic`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#arabic-analyzer), [`armenian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#armenian-analyzer), [`basque`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#basque-analyzer), [`brazilian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#brazilian-analyzer), [`bulgarian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#bulgarian-analyzer), [`catalan`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#catalan-analyzer), [`cjk`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#cjk-analyzer), [`czech`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#czech-analyzer), [`danish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#danish-analyzer), [`dutch`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#dutch-analyzer), [`english`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#english-analyzer), [`finnish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#finnish-analyzer), [`french`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#french-analyzer), [`galician`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#galician-analyzer), [`german`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#german-analyzer), [`greek`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#greek-analyzer), [`hindi`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#hindi-analyzer), [`hungarian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#hungarian-analyzer), [`indonesian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#indonesian-analyzer), [`irish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#irish-analyzer), [`italian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#italian-analyzer), [`latvian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#latvian-analyzer), [`lithuanian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#lithuanian-analyzer), [`norwegian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#norwegian-analyzer), [`persian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#persian-analyzer), [`portuguese`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#portuguese-analyzer), [`romanian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#romanian-analyzer), [`russian`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#russian-analyzer), [`sorani`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#sorani-analyzer), [`spanish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#spanish-analyzer), [`swedish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#swedish-analyzer), [`turkish`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#turkish-analyzer), [`thai`](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/analysis-lang-analyzer.html#thai-analyzer).

## **Configuring language analyzers** <a href="#id-yu-yan-fen-xi-qi-pei-zhi-yu-yan-fen-xi-yi" id="id-yu-yan-fen-xi-qi-pei-zhi-yu-yan-fen-xi-yi"></a>

### Stopwords <a href="#id-yu-yan-fen-xi-qi-stopwords-ting-zhi-ci" id="id-yu-yan-fen-xi-qi-stopwords-ting-zhi-ci"></a>

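All language analyzers support custom stopwords, set either inline in the configuration or loaded from an external stopwords file via `stopwords_path`. See the Stop Analyzer for more details.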
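As a rough sketch of the two options, the Python snippet below builds an index-settings body that configures the built-in `english` analyzer twice: once with an inline `stopwords` list and once with `stopwords_path`. The analyzer names `my_english_inline` and `my_english_file`, the word list, and the file path `stopwords/english_custom.txt` are illustrative assumptions and do not come from the original text.

```python
import json

# Hypothetical analyzer names, word list, and file path, chosen for illustration only.
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Stopwords given inline as an array.
                "my_english_inline": {
                    "type": "english",
                    "stopwords": ["a", "an", "the"],
                },
                # Stopwords loaded from an external file.
                "my_english_file": {
                    "type": "english",
                    "stopwords_path": "stopwords/english_custom.txt",
                },
            }
        }
    }
}

# Serialize the body that would be sent when creating the index.
body = json.dumps(settings, indent=2)
print(body)
```

The printed JSON is meant to be used as the request body when creating an index; the snippet itself does not contact a cluster.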

### Excluding words from stemming <a href="#id-yu-yan-fen-xi-qi-excludingwordsfromstemming-pai-chu-ci-gan" id="id-yu-yan-fen-xi-qi-excludingwordsfromstemming-pai-chu-ci-gan"></a>

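The `stem_exclusion` parameter lets you specify an array of lowercase words that should not be stemmed. Internally, this is implemented by adding a `keyword_marker` token filter whose `keywords` are set to the value of the `stem_exclusion` parameter.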
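The relationship between the two forms can be sketched as follows. The snippet builds a built-in `english` analyzer definition that uses `stem_exclusion`, next to the `keyword_marker` filter definition that carries the same word list. The word list and the filter name `english_keywords` are assumptions made for illustration.

```python
import json

# Words to leave unstemmed; a made-up example list. Entries must be lowercase.
protected_words = ["organization", "organizations"]

# Option A: pass the list directly to the built-in language analyzer.
analyzer_with_exclusion = {
    "type": "english",
    "stem_exclusion": protected_words,
}

# Option B: the corresponding token filter for a custom analyzer.
# The filter name "english_keywords" is an assumption, not a fixed name.
keyword_filter = {
    "english_keywords": {
        "type": "keyword_marker",
        "keywords": protected_words,
    }
}

print(json.dumps(analyzer_with_exclusion, indent=2))
print(json.dumps(keyword_filter, indent=2))
```

Option A is the shorter form when the built-in analyzer is otherwise sufficient; Option B only applies when the analyzer is rebuilt as a custom analyzer.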

### Reimplementing language analyzers <a href="#id-yu-yan-fen-xi-qi-reimplementinglanguageanalyzers-zhong-xin-shi-xian-yu-yan-fen-xi-qi" id="id-yu-yan-fen-xi-qi-reimplementinglanguageanalyzers-zhong-xin-shi-xian-yu-yan-fen-xi-qi"></a>

The built-in language analyzers can be reimplemented as `custom` analyzers (as described below) in order to customize their behavior.

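**Note:** If you do not intend to exclude words from being stemmed (the equivalent of the `stem_exclusion` parameter above), you should remove the `keyword_marker` token filter from the custom analyzer configuration.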
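One way to apply this note is to generate the settings and include the `keyword_marker` filter only when there are words to protect. The sketch below does this for a custom analyzer that mirrors `armenian`, reusing the filter names from the `armenian` example on this page. The helper name `build_armenian_settings` and its `stem_exclusion` argument are hypothetical.

```python
import json

def build_armenian_settings(stem_exclusion=None):
    """Build settings for a custom analyzer mirroring `armenian`.

    The keyword_marker filter is added only when stem_exclusion is non-empty.
    """
    filters = {
        "armenian_stop": {"type": "stop", "stopwords": "_armenian_"},
        "armenian_stemmer": {"type": "stemmer", "language": "armenian"},
    }
    chain = ["lowercase", "armenian_stop"]
    if stem_exclusion:
        filters["armenian_keywords"] = {
            "type": "keyword_marker",
            "keywords": list(stem_exclusion),
        }
        # The keyword marker is placed before the stemmer in the chain.
        chain.append("armenian_keywords")
    chain.append("armenian_stemmer")
    return {
        "settings": {
            "analysis": {
                "filter": filters,
                "analyzer": {
                    "armenian": {"tokenizer": "standard", "filter": chain}
                },
            }
        }
    }

# No exclusions: the keyword_marker filter is left out entirely.
print(json.dumps(build_armenian_settings(), indent=2))
```

Calling the helper with a non-empty list of lowercase words adds the filter back in front of the stemmer.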

## `arabic` analyzer <a href="#id-yu-yan-fen-xi-qi-arabic-fen-xi-qi" id="id-yu-yan-fen-xi-qi-arabic-fen-xi-qi"></a>

The `arabic` analyzer could be reimplemented as a `custom` analyzer as follows:

```
{
  "settings": {
    "analysis": {
      "filter": {
        "arabic_stop": {
          "type":       "stop",
          "stopwords":  "_arabic_"  # 1
        },
        "arabic_keywords": {
          "type":       "keyword_marker",
          "keywords":   []          # 2
        },
        "arabic_stemmer": {
          "type":       "stemmer",
          "language":   "arabic"
        }
      },
      "analyzer": {
        "arabic": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "arabic_stop",
            "arabic_normalization",
            "arabic_keywords",
            "arabic_stemmer"
          ]
        }
      }
    }
  }
}
```

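1. The default stopwords can be overridden with the `stopwords` or `stopwords_path` parameters.
2. This filter (`arabic_keywords`) should be removed unless there are words that should be excluded from stemming.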

## `armenian` analyzer <a href="#id-yu-yan-fen-xi-qi-armenian-fen-xi-qi" id="id-yu-yan-fen-xi-qi-armenian-fen-xi-qi"></a>

The `armenian` analyzer could be reimplemented as a `custom` analyzer as follows:

```
{
  "settings": {
    "analysis": {
      "filter": {
        "armenian_stop": {
          "type":       "stop",
          "stopwords":  "_armenian_"
        },
        "armenian_keywords": {
          "type":       "keyword_marker",
          "keywords":   []
        },
        "armenian_stemmer": {
          "type":       "stemmer",
          "language":   "armenian"
        }
      },
      "analyzer": {
        "armenian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "armenian_stop",
            "armenian_keywords",
            "armenian_stemmer"
          ]
        }
      }
    }
  }
}
```

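The other language analyzers are broadly similar and differ only in minor details. For the rest, see the official documentation: [Language Analyzers](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/analysis-lang-analyzer.html)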

