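Elasticsearch in Practice Trilogy, Part 3: Search Operations

This is the final part of the "Elasticsearch in Practice" trilogy. Search is the core feature of Elasticsearch, so every exercise in this article revolves around it.

Series links

  1. "Elasticsearch in Practice Trilogy, Part 1: Index Operations"
  2. "Elasticsearch in Practice Trilogy, Part 2: Document Operations"
  3. "Elasticsearch in Practice Trilogy, Part 3: Search Operations"

Environment

  1. This walkthrough uses Elasticsearch 6.5.4 installed on Ubuntu 16.04.5 LTS; the client tool is Postman 6.6.1.
  2. If you need to set up an Elasticsearch environment, see
    "Quickly Setting Up an Elasticsearch 6.5.4 Cluster and the Head Plugin on Linux"

Background

The Elasticsearch environment for this walkthrough is already in place: a cluster built from two machines, with elasticsearch-head installed as well:

  1. Machine one, IP address: 192.168.119.152
  2. Machine two, IP address: 192.168.119.153
  3. elasticsearch-head is installed on machine one and is reachable at http://192.168.119.152:9100
  4. The index englishbooks has been created with the data shown below; import it into Elasticsearch with the bulk command: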
{"index":{ "_index": "englishbooks", "_type": "IT", "_id": "1" }}
{"id":"1","title":"Deep Learning","language":"python","author":"Yoshua Bengio","price":549.00,"publish_time":"2016-11-18","description":"written by three experts in the field, deep learning is the only comprehensive book on the subject."}
{"index":{ "_index": "englishbooks", "_type": "IT", "_id": "2" }}
{"id":"2","title":"Compilers","language":"c","author":"Alfred V.Aho","price":62.50,"publish_time":"2011-01-01","description":"In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."}
{"index":{ "_index": "englishbooks", "_type": "IT", "_id": "3" }}
{"id":"3","title":"Core Java","language":"java","author":"Horstmann","price":85.90,"publish_time":"2016-06-01","description":"The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "}
{"index":{ "_index": "englishbooks", "_type": "IT", "_id": "4" }}
{"id":"4","title":"Thinking in Java","language":"java","author":"Bruce Eckel","price":70.10,"publish_time":"2015-07-06","description":"Thinking in Java should be read cover to cover by every Java programmer, then kept close at hand for frequent reference. The exercises are challenging, and the chapter on Collections is superb!"}
{"index":{ "_index": "englishbooks", "_type": "IT", "_id": "5" }}
{"id":"5","title":"The Go Programming Language","language":"go","author":"Alan A.A.Donovan","price":63.90,"publish_time":"2016-01-01","description":"A declaration's lexical block determines its scope, which may be large or small. The declarations of built-in types, functions, and constants like int, len, and true are in the universe block and can be referred to throughout the entire program."}
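  5. The documents were imported in bulk; for details on the document data and bulk operations, see "Elasticsearch in Practice Trilogy, Part 2: Document Operations".
  6. In elasticsearch-head, the documents of the englishbooks index look like this:
    [Image: documents of the englishbooks index as displayed in elasticsearch-head]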
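
As a rough illustration of the bulk format used above, the following Python sketch builds the same kind of request body: one action line followed by one source line per document, each on its own line. The helper name build_bulk_body and the trimmed-down book records are assumptions made for this example; the resulting text would be sent as the body of a POST to the _bulk endpoint.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Return an NDJSON string: one action line, then one source line, per document."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects every line, including the last one, to end with a newline.
    return "\n".join(lines) + "\n"


# Two of the five books, trimmed to a few fields for brevity (illustrative only).
books = [
    {"id": "1", "title": "Deep Learning", "language": "python", "price": 549.00},
    {"id": "2", "title": "Compilers", "language": "c", "price": 62.50},
]

bulk_body = build_bulk_body("englishbooks", "IT", books)
print(bulk_body)
```

Keeping each JSON object on a single line matters here, because the bulk endpoint uses the newline as the separator between actions and documents.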

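Conventions for requests and responses

To keep things unambiguous, here is how requests and responses are written in this article:

  1. Suppose Postman is used to send a PUT request to the server at http://192.168.119.152:9200/test001/article/1
  2. The request body is JSON, as follows:
{
	"id":1,
	"title":"Title a",
	"posttime":"2019-01-12",
	"content":"Let's get familiar with document operations"
}

In this article, that request is written as:

PUT test001/article/1

{
	"id":1,
	"title":"Title a",
	"posttime":"2019-01-12",
	"content":"Let's get familiar with document operations"
}

When you see this format, send a PUT request in Postman whose address is your server address followed by "test001/article/1", with the JSON above as the body.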
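
For readers who prefer a script to Postman, the convention above can be sketched in Python with the standard library only. The helper build_request and the constant BASE_URL are hypothetical names introduced for this example, and the base address is the machine-one address from this article; replace it with your own server. The sketch only builds the request and does not send it, so it runs without a cluster.

```python
import json
import urllib.request

# Assumption: the address of machine one in this article; use your own server here.
BASE_URL = "http://192.168.119.152:9200"


def build_request(shorthand, body=None):
    """Turn the article's shorthand, e.g. 'PUT test001/article/1', into a Request object."""
    method, path = shorthand.split(maxsplit=1)
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


req = build_request(
    "PUT test001/article/1",
    {"id": 1, "title": "Title a", "posttime": "2019-01-12"},
)
print(req.get_method(), req.full_url)
# prints: PUT http://192.168.119.152:9200/test001/article/1
```

Passing the returned object to urllib.request.urlopen would actually send the request.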

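Document content in this article is English only

All sample data is in English, so that tokenization problems with the analyzer do not cause Chinese searches to come back empty. Analyzers are covered in detail in a separate article.

Viewing all data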

GET englishbooks/_search

{
	"query":{
		"match_all":{}
	}
}

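The query above returns every document in the englishbooks index, and every document gets a score of 1.
You can also delete the JSON body and request englishbooks/_search on its own; it returns all documents as well, because a search without a body behaves like match_all.

Exact match on a numeric field

Find the documents whose price equals 549: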

GET englishbooks/_search

{
	"query":{
		"constant_score":{
			"filter":{
				"term":{"price":549}
			}
		}
	}
}

Result:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "id": "1",
                    "title": "Deep Learning",
                    "language": "python",
                    "author": "Yoshua Bengio",
                    "price": 549,
                    "publish_time": "2016-11-18",
                    "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
                }
            }
        ]
    }
}

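With constant_score in the request, the term query runs in non-scoring (filter) mode, and every matching document receives the same score of 1.

Checking how text is analyzed

Fields of type text are analyzed into terms, which are then used to build the inverted index. Here is how the title value "Core Java" is analyzed: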

GET englishbooks/_analyze

{
	"field":"title",
	"text":"Core Java"
}

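The response is shown below. "Core Java" is split into the two terms "core" and "java", which means that searching the title field for either term "core" or "java" finds the document: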

{
    "tokens": [
        {
            "token": "core",
            "start_offset": 0,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "java",
            "start_offset": 5,
            "end_offset": 9,
            "type": "<ALPHANUM>",
            "position": 1
        }
    ]
}

Note that the resulting terms are all lowercase; that lowercasing is done by the analyzer.
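
The effect can be imitated locally with a few lines of Python. This is only a rough stand-in for the analyzer, assuming plain English text: it splits on anything that is not an ASCII letter or digit and lowercases each piece, so it will differ from Elasticsearch on inputs with punctuation inside words or non-English characters. The function name simple_analyze is made up for this sketch.

```python
import re


def simple_analyze(text):
    """Approximate analysis of plain English text: split into words, then lowercase."""
    tokens = []
    for position, match in enumerate(re.finditer(r"[A-Za-z0-9]+", text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


for token in simple_analyze("Core Java"):
    print(token)
# {'token': 'core', 'start_offset': 0, 'end_offset': 4, 'position': 0}
# {'token': 'java', 'start_offset': 5, 'end_offset': 9, 'position': 1}
```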

Term query

The analysis above showed that "Core Java" is split into the terms "core" and "java". Now search with "java" as the keyword:

GET englishbooks/_search

{
	"query":{
		"term":{"title":"java"}
	}
}

The result is below; both documents whose title contains the term java are found:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 0.5754429,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "4",
                "_score": 0.5754429,
                "_source": {
                    "id": "4",
                    "title": "Thinking in Java",
                    "language": "java",
                    "author": "Bruce Eckel",
                    "price": 70.1,
                    "publish_time": "2015-07-06",
                    "description": "Thinking in Java should be read cover to cover by every Java programmer, then kept close at hand for frequent reference. The exercises are challenging, and the chapter on Collections is superb!"
                }
            },
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "3",
                "_score": 0.2876821,
                "_source": {
                    "id": "3",
                    "title": "Core Java",
                    "language": "java",
                    "author": "Horstmann",
                    "price": 85.9,
                    "publish_time": "2016-06-01",
                    "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
                }
            }
        ]
    }
}

Match query

  1. A term query treats its input as a single term. For example, the following query returns nothing:

GET englishbooks/_search

{
	"query":{
		"term":{"title":"core java"}
	}
}

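The query returns nothing because "core java" is looked up as one term, while the terms indexed for title are individual analyzed terms such as "core" and "java"; no term named "core java" exists in the index.

  2. If the search input "core java" is analyzed as well, and the resulting terms "core" and "java" are used for the search, the documents should be found. That is what a match query does: it analyzes the input before searching.

GET englishbooks/_search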

{
	"query":{
		"match":{"title":"Core Java"}
	}
}

The search result is below; both documents containing java are returned:

{
    "took": 8,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 0.5754429,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "4",
                "_score": 0.5754429,
                "_source": {
                    "id": "4",
                    "title": "Thinking in Java",
                    "language": "java",
                    "author": "Bruce Eckel",
                    "price": 70.1,
                    "publish_time": "2015-07-06",
                    "description": "Thinking in Java should be read cover to cover by every Java programmer, then kept close at hand for frequent reference. The exercises are challenging, and the chapter on Collections is superb!"
                }
            },
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "3",
                "_score": 0.5753642,
                "_source": {
                    "id": "3",
                    "title": "Core Java",
                    "language": "java",
                    "author": "Horstmann",
                    "price": 85.9,
                    "publish_time": "2016-06-01",
                    "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
                }
            }
        ]
    }
}
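
  3. If the intent was to match only "Core Java", the result above is too broad. Adding "operator":"and" to the query returns only documents that match every term. Note that the JSON structure changes slightly: the value of title used to be the search text itself, and it is now a JSON object whose query property holds the search text:

GET englishbooks/_search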

{
	"query":{
		"match":{
			"title":{
				"query":"Core Java",
				"operator":"and"
			}
		}
	}
}

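This time the result contains only the document that matches both terms "core" and "java" (they are lowercase because "Core Java" is lowercased during analysis before the search runs):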

{
    "took": 11,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.5753642,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "3",
                "_score": 0.5753642,
                "_source": {
                    "id": "3",
                    "title": "Core Java",
                    "language": "java",
                    "author": "Horstmann",
                    "price": 85.9,
                    "publish_time": "2016-06-01",
                    "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
                }
            }
        ]
    }
}
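
The difference between a term query, a match query, and a match query with "operator":"and" can be re-created with a toy inverted index in Python. This is a simplified sketch, not how Elasticsearch is implemented: the tokenizer is a crude approximation for plain English titles, scoring is ignored, and the names tokenize, term_lookup, and match_lookup are invented for the example. The titles are the five from the sample data.

```python
import re
from collections import defaultdict

TITLES = {
    "1": "Deep Learning",
    "2": "Compilers",
    "3": "Core Java",
    "4": "Thinking in Java",
    "5": "The Go Programming Language",
}


def tokenize(text):
    """Crude analysis for English text: split into words and lowercase them."""
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


def build_index(docs):
    """Map every term to the set of document ids whose title contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


INDEX = build_index(TITLES)


def term_lookup(term):
    # Like a term query: the input is looked up as-is, with no analysis.
    return sorted(INDEX.get(term, set()))


def match_lookup(text, operator="or"):
    # Like a match query: analyze the input first, then combine the per-term hits.
    hit_sets = [INDEX.get(term, set()) for term in tokenize(text)]
    if not hit_sets:
        return []
    combine = set.intersection if operator == "and" else set.union
    return sorted(combine(*hit_sets))


print(term_lookup("java"))                        # ['3', '4']
print(term_lookup("core java"))                   # []
print(match_lookup("Core Java"))                  # ['3', '4']
print(match_lookup("Core Java", operator="and"))  # ['3']
```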

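match_phrase search

match_phrase is similar to the match query above, with two additional properties:

  1. Every term produced by analysis must match, which is the same effect as "operator":"and".
  2. The terms must appear in the field in the same order as in the search text.

GET englishbooks/_search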

{
	"query":{
		"match_phrase":{"title":"Core Java"}
	}
}

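The query above finds the document. Changing "Core Java" to "Java Core" finds nothing, whereas a match query with "Java Core" still returns results.

match_phrase_prefix search

match_phrase_prefix works like match_phrase, except that the last term is matched as a prefix. In the example below, the search text "Core J" finds nothing with match_phrase but does with match_phrase_prefix, because "J" matches "Java" as a prefix: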

GET englishbooks/_search

{
	"query":{
		"match_phrase_prefix":{"title":"Core J"}
	}
}
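
The ordering rule of match_phrase and the prefix rule of match_phrase_prefix can be imitated on the sample titles with a short Python sketch. It assumes the simplest case, where the query terms must appear next to each other in the title, and it uses a crude tokenizer; the function names are hypothetical and nothing here reflects how Elasticsearch evaluates these queries internally.

```python
import re

TITLES = {
    "1": "Deep Learning",
    "2": "Compilers",
    "3": "Core Java",
    "4": "Thinking in Java",
    "5": "The Go Programming Language",
}


def tokenize(text):
    """Crude analysis for English text: split into words and lowercase them."""
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


def phrase_match(query, text, last_term_is_prefix=False):
    """True if the query terms occur in the text consecutively and in order.

    With last_term_is_prefix=True the final query term only has to be a
    prefix of the term at that position.
    """
    q, t = tokenize(query), tokenize(text)
    if not q:
        return False
    for start in range(len(t) - len(q) + 1):
        window = t[start:start + len(q)]
        head_ok = window[:-1] == q[:-1]
        if last_term_is_prefix:
            last_ok = window[-1].startswith(q[-1])
        else:
            last_ok = window[-1] == q[-1]
        if head_ok and last_ok:
            return True
    return False


def search(query, last_term_is_prefix=False):
    return sorted(
        doc_id for doc_id, title in TITLES.items()
        if phrase_match(query, title, last_term_is_prefix)
    )


print(search("Core Java"))                         # ['3']
print(search("Java Core"))                         # []
print(search("Core J"))                            # []
print(search("Core J", last_term_is_prefix=True))  # ['3']
```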

multi_match search

multi_match builds on match and adds support for searching several fields. The following query searches both the title and description fields for the terms "1986" and "deep":

GET englishbooks/_search

{
	"query":{
		"multi_match":{
			"query":"1986 deep",
			"fields":["title", "description"]
		}
	}
}

The response is below; every document whose title or description contains the term "1986" or "deep" is returned:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 0.79237825,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "2",
                "_score": 0.79237825,
                "_source": {
                    "id": "2",
                    "title": "Compilers",
                    "language": "c",
                    "author": "Alfred V.Aho",
                    "price": 62.5,
                    "publish_time": "2011-01-01",
                    "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly."
                }
            },
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "1",
                "_score": 0.2876821,
                "_source": {
                    "id": "1",
                    "title": "Deep Learning",
                    "language": "python",
                    "author": "Yoshua Bengio",
                    "price": 549,
                    "publish_time": "2016-11-18",
                    "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
                }
            }
        ]
    }
}
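
To see why these two documents match, the following Python sketch checks each field separately and reports which fields contain at least one of the query terms. It is a simplified illustration that ignores scoring and uses a crude tokenizer; the function multi_match_fields is a made-up name, and only three of the five books are included to keep the example short.

```python
import re

BOOKS = {
    "1": {
        "title": "Deep Learning",
        "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject.",
    },
    "2": {
        "title": "Compilers",
        "description": "In the time since the 1986 edition of this book, the world of compiler designhas changed significantly.",
    },
    "3": {
        "title": "Core Java",
        "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. ",
    },
}


def tokenize(text):
    """Crude analysis for English text: split into words and lowercase them."""
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


def multi_match_fields(query, fields):
    """Return {doc_id: [fields that contain at least one query term]}."""
    terms = set(tokenize(query))
    hits = {}
    for doc_id, doc in BOOKS.items():
        matched = [field for field in fields if terms & set(tokenize(doc[field]))]
        if matched:
            hits[doc_id] = matched
    return hits


print(multi_match_fields("1986 deep", ["title", "description"]))
# {'1': ['title', 'description'], '2': ['description']}
```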

terms query

terms is an extension of the term query that searches for several terms at once:

GET englishbooks/_search

{
	"query":{
		"terms":{
			"title":["deep", "core"]
		}
	}
}

The response is below; the documents whose title contains deep or core are both found:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 1,
        "hits": [
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "id": "1",
                    "title": "Deep Learning",
                    "language": "python",
                    "author": "Yoshua Bengio",
                    "price": 549,
                    "publish_time": "2016-11-18",
                    "description": "written by three experts in the field, deep learning is the only comprehensive book on the subject."
                }
            },
            {
                "_index": "englishbooks",
                "_type": "IT",
                "_id": "3",
                "_score": 1,
                "_source": {
                    "id": "3",
                    "title": "Core Java",
                    "language": "java",
                    "author": "Horstmann",
                    "price": 85.9,
                    "publish_time": "2016-06-01",
                    "description": "The book is aimed at experienced programmers who want to learn how to write useful Java applications and applets. "
                }
            }
        ]
    }
}

Range query

A range query matches a range of values, for example the documents whose publish_time falls between "2016-01-01" and "2016-12-31":

GET englishbooks/_search

{
	"query":{
		"range":{
			"publish_time":{
				"gte":"2016-01-01",
				"lte":"2016-12-31",
				"format":"yyyy-MM-dd"
			}
		}
	}
}

For brevity, the response is omitted here.
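
As a local cross-check of the gte/lte semantics, the same inclusive range can be applied to the publish_time values of the five sample books in Python. This is not the Elasticsearch response, only a sketch showing which document ids fall inside the bounds; the helper in_range is a name chosen for this example.

```python
from datetime import date

# publish_time values of the five sample books, keyed by document id.
PUBLISH_TIMES = {
    "1": "2016-11-18",
    "2": "2011-01-01",
    "3": "2016-06-01",
    "4": "2015-07-06",
    "5": "2016-01-01",
}


def in_range(value, gte, lte):
    """Inclusive date range check on yyyy-MM-dd strings, mirroring gte and lte."""
    return date.fromisoformat(gte) <= date.fromisoformat(value) <= date.fromisoformat(lte)


matching_ids = sorted(
    doc_id for doc_id, published in PUBLISH_TIMES.items()
    if in_range(published, "2016-01-01", "2016-12-31")
)
print(matching_ids)  # ['1', '3', '5']
```

Document 5 is included because gte is inclusive: its publish_time is exactly the lower bound.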

exists query

An exists query returns the documents that have at least one non-null value in the specified field:

GET englishbooks/_search

{
	"query":{
		"exists":{
			
            
           
