Installing the Elasticsearch Chinese analyzer (elasticsearch-analysis-ik)

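Download the release that matches your Elasticsearch version (v6.3.0 here, the latest at the time of writing):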
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip

In the Elasticsearch plugins directory, create an ik directory:

cd /usr/local/elasticsearch-6.3.0/plugins
mkdir ik

Unzip the archive and place the extracted contents in that directory.
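
The manual steps above can be sketched as a single shell function. This is only a minimal sketch, not the author's own script: ES_HOME and IK_VERSION are assumed variables (defaulting to the path and version used in this post), install_ik_manually is a hypothetical helper name, and curl and unzip are assumed to be installed.

```shell
# Assumed defaults, taken from the path and version used in this post.
ES_HOME="${ES_HOME:-/usr/local/elasticsearch-6.3.0}"
IK_VERSION="${IK_VERSION:-6.3.0}"
IK_ZIP="elasticsearch-analysis-ik-${IK_VERSION}.zip"
IK_URL="https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v${IK_VERSION}/${IK_ZIP}"

# Hypothetical helper: download the release and unpack it into plugins/ik.
install_ik_manually() {
  mkdir -p "${ES_HOME}/plugins/ik" || return 1
  # -f: fail on HTTP errors, -L: follow redirects to the release asset
  curl -fL -o "/tmp/${IK_ZIP}" "${IK_URL}" || return 1
  # -o: overwrite existing files, -d: extract into the plugin directory
  unzip -o "/tmp/${IK_ZIP}" -d "${ES_HOME}/plugins/ik"
}
```

Nothing is downloaded until install_ik_manually is called. Setting IK_VERSION before sourcing the snippet selects a different release, which should match the Elasticsearch version in use.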

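Restart the Elasticsearch service so that the plugin is loaded. The command below assumes a local wrapper script or service alias, since the stock bin/elasticsearch launcher does not accept a restart argument; on a systemd package install, sudo systemctl restart elasticsearch does the same job.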

elasticsearch restart

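The IK analyzers are now available. Re-insert the data so that it is indexed again (a field uses IK only when its mapping specifies an IK analyzer), then run a search: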

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "about" : "程序員 編程"
        }
    }
}

Search result:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.654172,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1.654172,
        "_source": {
          "first_name": "張",
          "last_name": "三",
          "age": 24,
          "about": "一個PHP程序員,熱愛編程,熱愛生活,充滿激情。",
          "interests": [
            "英雄聯盟"
          ]
        }
      }
    ]
  }
}
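
Installing the plugin does not change how existing fields are analyzed; a field is tokenized by IK only if its mapping names one of the IK analyzers. The post does not show its mapping, so the following is a hedged sketch of what one could look like for the megacorp index queried above. The assumptions: Elasticsearch answers on localhost:9200, the about field is the one to analyze, and ik_max_word for indexing with ik_smart for searching is just one possible pairing. create_megacorp_index is a hypothetical helper name.

```shell
# Assumption: Elasticsearch listens on the default HTTP port on this machine.
ES_URL="${ES_URL:-http://localhost:9200}"

# Mapping body in 6.x syntax (with a mapping type): the "about" field is
# indexed with ik_max_word and searched with ik_smart.
IK_MAPPING='{
  "mappings": {
    "employee": {
      "properties": {
        "about": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}'

# Hypothetical helper: create the index with the mapping above.
create_megacorp_index() {
  curl -s -X PUT "${ES_URL}/megacorp" \
    -H 'Content-Type: application/json' \
    -d "${IK_MAPPING}"
}
```

The analyzer of an existing field cannot be changed, so the index has to be created with this mapping before the documents are inserted again.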

Alternatively, install the plugin online with elasticsearch-plugin, run from the Elasticsearch home directory. The download can be slow.

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
[=================================================] 100%
-> Installed analysis-ik

The plugins directory now contains one more folder.
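
Besides looking at the folder, the running node can be asked which plugins it has loaded through the _cat/plugins API. A small sketch, assuming the node answers on localhost:9200 and has been restarted since the install; list_plugins and has_ik are hypothetical helper names, and the plugin name analysis-ik is taken from the installer output above.

```shell
# Assumption: Elasticsearch listens on the default HTTP port on this machine.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print one line per loaded plugin (node name, component, version).
list_plugins() {
  curl -s "${ES_URL}/_cat/plugins?v"
}

# Read a plugin listing on stdin; succeed if it mentions analysis-ik.
has_ik() {
  grep -q 'analysis-ik'
}
```

For example, list_plugins | has_ik && echo "IK is loaded" prints the message only when the plugin shows up in the listing.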

Usage

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "中華人民共和國國歌"
}

Result:

{
  "tokens": [
    {
      "token": "中華人民共和國",
      "start_offset": 0,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "國歌",
      "start_offset": 7,
      "end_offset": 9,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}

Another example:

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "王者榮耀是最好玩的遊戲"
}

Result:

{
  "tokens": [
    {
      "token": "王者",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "榮耀",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "是",
      "start_offset": 4,
      "end_offset": 5,
      "type": "CN_CHAR",
      "position": 2
    },
    {
      "token": "最",
      "start_offset": 5,
      "end_offset": 6,
      "type": "CN_CHAR",
      "position": 3
    },
    {
      "token": "好玩",
      "start_offset": 6,
      "end_offset": 8,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "的",
      "start_offset": 8,
      "end_offset": 9,
      "type": "CN_CHAR",
      "position": 5
    },
    {
      "token": "遊戲",
      "start_offset": 9,
      "end_offset": 11,
      "type": "CN_WORD",
      "position": 6
    }
  ]
}
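
Both examples use ik_smart, which produces the coarsest split. The plugin also registers a second analyzer, ik_max_word, which splits the text at the finest granularity. The sketch below makes it easy to compare the two from a shell instead of the console. It assumes Elasticsearch on localhost:9200; analyze_body and ik_analyze are hypothetical helper names, and no output is shown because it depends on the installed dictionary.

```shell
# Assumption: Elasticsearch listens on the default HTTP port on this machine.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for the _analyze API.
# $1 = analyzer name, $2 = text. The text is inserted verbatim, so it must
# not contain double quotes or backslashes.
analyze_body() {
  printf '{"analyzer": "%s", "text": "%s"}' "$1" "$2"
}

# Send the text to _analyze with the chosen analyzer.
ik_analyze() {
  curl -s -X GET "${ES_URL}/_analyze?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1" "$2")"
}
```

Running ik_analyze ik_smart "中華人民共和國國歌" and then ik_analyze ik_max_word "中華人民共和國國歌" shows the difference between the two analyzers on the same input.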
