# Thai Tokenizer

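The **thai tokenizer** segments Thai text into words, using the Thai segmentation algorithm included with Java. Text in other languages is handled the same way as by the **standard tokenizer**.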

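> Note:
>
> Not every JRE supports this tokenizer. It is currently known to work with Sun/Oracle and OpenJDK. If your application needs to be fully portable, consider using the **ICU Tokenizer** instead.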

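As a rough sketch of the portable alternative mentioned in the note, the request below runs the same kind of analysis through the ICU tokenizer. It assumes the `analysis-icu` plugin is installed on every node (an assumption; it is not bundled by default), and the tokens it returns may differ from those produced by the `thai` tokenizer.

```
# Assumes the analysis-icu plugin is installed; "icu_tokenizer" is provided by that plugin
POST _analyze
{
  "tokenizer": "icu_tokenizer",
  "text": "การที่ได้ต้องแสดงว่างานดี"
}
```

Comparing the output of this request with that of the `thai` tokenizer on your own sample text is a reasonable way to decide whether the switch is acceptable.
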
## **Example output**

```
POST _analyze
{
  "tokenizer": "thai",
  "text": "การที่ได้ต้องแสดงว่างานดี"
}
```

The sentence above produces the following terms:

```
[ การ, ที่, ได้, ต้อง, แสดง, ว่า, งาน, ดี ]
```

## **Configuration**

The **thai tokenizer** is not configurable.

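Although the tokenizer takes no parameters of its own, it can still be referenced by name inside a custom analyzer. The following is a minimal sketch under that assumption; the index name `my-thai-index`, the analyzer name `my_thai_analyzer`, and the choice of the `lowercase` token filter are illustrative, not taken from this page.

```
# Hypothetical index and analyzer names; only the "thai" tokenizer comes from this page
PUT my-thai-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_thai_analyzer": {
          "type": "custom",
          "tokenizer": "thai",
          "filter": [ "lowercase" ]
        }
      }
    }
  }
}

# Run the custom analyzer against the sample sentence
POST my-thai-index/_analyze
{
  "analyzer": "my_thai_analyzer",
  "text": "การที่ได้ต้องแสดงว่างานดี"
}
```

The `lowercase` filter is included here only so that any Latin-script text mixed into Thai content is normalized; other token filters can be added to the `filter` list in the same way.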
