009-elasticsearch [3]: Importing Sample Data, URI Search, Introduction to the Query DSL, Query Basics (_source, match, must, should, etc.), Filters, Aggregations

阿新 • Published: 2018-03-05


1. Sample data

Customer bank account records, in JSON:

{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "gender": "F",
    "address": "244 Columbus Place",
    "employer": "Euron",
    "email": "[email protected]",
    "city": "Hobucken",
    "state": "CO"
}

Bulk-import the 1,000 documents

Test data location

curl -H "Content-Type: application/json" -XPOST 'localhost:9200/bank/account/_bulk?pretty&refresh' --data-binary "@accounts.json"
curl 'localhost:9200/_cat/indices?v'

On Windows, replace the single quotes with double quotes.
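The file passed to _bulk is newline-delimited JSON: each document is preceded by an action line, and the body must end with a newline. The following is a minimal sketch of how such a body could be assembled; the helper name build_bulk_body, the two sample accounts, and the choice of account_number as the document _id are assumptions made for illustration.

```python
import json


def build_bulk_body(accounts):
    """Return an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for account in accounts:
        # Action line: index the document, using account_number as its _id (an assumption).
        lines.append(json.dumps({"index": {"_id": str(account["account_number"])}}))
        # Source line: the document itself.
        lines.append(json.dumps(account))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Illustrative sample documents, not the real accounts.json contents.
sample = [
    {"account_number": 0, "balance": 16623, "state": "CO"},
    {"account_number": 1, "balance": 39225, "state": "IL"},
]
body = build_bulk_body(sample)
print(body, end="")
```

Saving this output to a file gives something that can be passed to curl with --data-binary, in the same way as accounts.json above.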

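2. Introduction to URI search

There are two basic ways to run a search: send the search parameters in the REST request URI, or send them in the REST request body.

2.1 Request URI method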

GET /bank/_search?q=*&sort=account_number:asc&pretty

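Notes: the q=* parameter tells Elasticsearch to match every document in the index.

The sort=account_number:asc parameter sorts the results in ascending order by each document's account_number field.

The pretty parameter asks for pretty-printed JSON in the response.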
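As a rough sketch, the same request URI can be assembled programmatically with Python's standard library. The helper name build_search_url and the localhost:9200 host are assumptions taken from the curl examples, not part of any Elasticsearch client.

```python
from urllib.parse import urlencode


def build_search_url(index, **params):
    """Build a URI search URL for the given index (hypothetical helper)."""
    # safe="*:" keeps the wildcard and the field:order separator unescaped.
    query = urlencode(params, safe="*:")
    return f"http://localhost:9200/{index}/_search?{query}"


url = build_search_url("bank", q="*", sort="account_number:asc", pretty="true")
print(url)
```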

{
  "took": 31,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": null,
    "hits": [
      {
        "_index": "bank",
        "_type": "account",
        "_id": "0",
        "_score": null,
        "_source": {
          "account_number": 0,
          "balance": 16623,
          "firstname": "Bradshaw",
          "lastname": "Mckenzie",
          "age": 29,
          "gender": "F",
          "address": "244 Columbus Place",
          "employer": "Euron",
          "email": "[email protected]",
          "city": "Hobucken",
          "state": "CO"
        },
        "sort": [0]
      }
      //……
    ]
  }
}

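Response fields

took: the time, in milliseconds, that Elasticsearch took to execute the search

timed_out: whether the search timed out

_shards: how many shards were searched, and how many of those searches succeeded or failed

hits: the search results

hits.total: the total number of documents matching the search criteria
hits.hits: the actual array of search results (the first 10 documents by default)
hits.sort: the sort key for each result (missing when sorting by score)
hits._score and max_score: ignore these for now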
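To make the field list concrete, here is a small sketch that pulls those fields out of a response already parsed into a Python dict. The response literal is a trimmed copy of the example above, and summarize is a hypothetical helper. It assumes hits.total is a plain integer, as in the 5.x response shown here; later major versions return an object instead.

```python
# Trimmed copy of the example response, as it would look after json.loads().
response = {
    "took": 31,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "hits": {
        "total": 1000,
        "max_score": None,
        "hits": [
            {"_index": "bank", "_type": "account", "_id": "0", "_score": None,
             "_source": {"account_number": 0, "balance": 16623}, "sort": [0]},
        ],
    },
}


def summarize(resp):
    """Collect the response fields described above into a flat dict."""
    hits = resp["hits"]
    return {
        "took_ms": resp["took"],
        "timed_out": resp["timed_out"],
        "failed_shards": resp["_shards"]["failed"],
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
        "sort_keys": [hit.get("sort") for hit in hits["hits"]],
    }


summary = summarize(response)
print(summary)
```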

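2.2 Request body method

When using the header tool, send this as a POST request (the search API also accepts POST, which helps with clients that cannot send a body with GET).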

GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
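A minimal sketch of sending the same body as a POST from Python's standard library follows. build_search_request is a hypothetical helper and the host is assumed to be localhost:9200, as in the curl examples; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(index, body, host="http://localhost:9200"):
    """Build (but do not send) a POST request carrying the search body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        f"{host}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


search_body = {
    "query": {"match_all": {}},
    "sort": [{"account_number": "asc"}],
}
req = build_search_request("bank", search_body)
print(req.get_method(), req.full_url)
# To actually send it against a running node: urllib.request.urlopen(req)
```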

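3. Introduction to the Query DSL

Elasticsearch provides a JSON-style domain-specific language for executing queries, known as the Query DSL.

Note: when using the header tool, send these requests as POST.

3.1 Match all documents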

GET /bank/_search
{
  "query": { "match_all": {} }
}

The match_all part is simply the type of query to run. A match_all query searches for all documents in the specified index.

3.2 Querying data

GET /bank/_search
{
  "query": { "match_all": {} },
  "size": 1
}

Note that if size is not specified, it defaults to 10.

3.3 Pagination

This example runs match_all and returns documents 11 through 20:

GET /bank/_search
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}

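The from parameter (0-based) specifies the document offset to start from, and the size parameter specifies how many documents to return starting at that offset. This is useful when implementing paged search results. Note that if from is not specified, it defaults to 0.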
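The arithmetic behind from and size can be sketched as follows. page_params is a hypothetical helper that converts a 1-based page number into the two parameters; it is not part of Elasticsearch.

```python
def page_params(page, page_size=10):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # from is 0-based, so page 1 starts at offset 0, page 2 at offset page_size, and so on.
    return {"from": (page - 1) * page_size, "size": page_size}


# Page 2 with 10 results per page covers documents 11 through 20.
body = {"query": {"match_all": {}}, **page_params(2)}
print(body)
```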

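3.4 Descending sort

This example runs match_all, sorts the results by account balance in descending order, and returns the top 10 documents (the default size).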

GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}

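4. Query basics

4.1 Returning specific fields

Adding a _source field to the request is conceptually somewhat similar to the field list in SQL SELECT field1, field2 FROM ... .

This controls which document fields are returned. By default, the full JSON document is returned as part of every search hit. This is called the source (the _source field in each hit). If the whole source document is not wanted, the request can ask for only a few fields from within the source.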

GET /bank/_search
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
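The effect on each hit can be pictured with a plain dictionary filter. This is only a simplified sketch: filter_source is a hypothetical helper, and it handles exact field names only, not the wildcard patterns that _source filtering also accepts.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields of a document (simplified)."""
    return {key: value for key, value in source.items() if key in fields}


document = {
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
}
trimmed = filter_source(document, ["account_number", "balance"])
print(trimmed)
```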

4.2 Match queries

Match all:

  "query": { "match_all": {} },

The match query can be thought of as the basic fielded search query (that is, a search against a specific field or set of fields).

// Match documents whose account_number is 20
GET /bank/_search
{
  "query": { "match": { "account_number": 20 } }
}

// Match documents whose address contains the term "mill"
GET /bank/_search
{
  "query": { "match": { "address": "mill" } }
}

// Match documents whose address contains the term "mill" or the term "lane"
GET /bank/_search
{
  "query": { "match": { "address": "mill lane" } }
}

// Match documents whose address contains the whole phrase "mill lane"
GET /bank/_search
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
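The difference between match and match_phrase can be illustrated with a deliberately simplified model in plain Python. It lowercases and splits on whitespace, which only approximates the default analyzer, and it ignores scoring entirely; the function names and sample addresses are made up for illustration.

```python
def tokens(text):
    """Very rough stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def match(text, query):
    """True if the text contains ANY of the query terms (default OR behaviour)."""
    doc_terms = set(tokens(text))
    return any(term in doc_terms for term in tokens(query))


def match_phrase(text, query):
    """True if the query terms appear consecutively and in order."""
    doc, phrase = tokens(text), tokens(query)
    return any(doc[i:i + len(phrase)] == phrase
               for i in range(len(doc) - len(phrase) + 1))


addresses = ["990 Mill Road", "198 Mill Lane", "789 Madison Street", "171 Putnam Lane"]
any_term = [a for a in addresses if match(a, "mill lane")]
whole_phrase = [a for a in addresses if match_phrase(a, "mill lane")]
print(any_term)
print(whole_phrase)
```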

4.3 bool (Boolean) queries

4.3.1 must == and

// Match documents whose address contains both "mill" and "lane". This is not the same as match_phrase: the two terms do not have to be adjacent or in order.

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

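The bool must clause lists queries that must all be true for a document to be considered a match.

4.3.2 should == or

// Match documents whose address contains "mill" or "lane". In terms of which documents match, this is equivalent to "query": { "match": { "address": "mill lane" } }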
GET /bank/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

The bool should clause lists queries of which at least one must be true for a document to be considered a match.

4.3.3 must_not == not

// Match documents whose address contains neither "mill" nor "lane"
GET /bank/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}

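The bool must_not clause lists queries none of which may be true for a document to be considered a match.

4.3.4 Combining clauses

The must, should, and must_not clauses can be combined in a single bool query. In addition, bool queries can be nested inside any of these clauses to model arbitrarily complex multi-level Boolean logic.

// Return all accounts of anyone who is 40 years old but does not live in the state ID (Idaho)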
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
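How the three clause types combine can be sketched with plain Python predicates. This models matching only, not scoring, and bool_match is a hypothetical helper. It follows the default rule that should clauses become mandatory (at least one must match) only when the bool query has no must clause; the sample people are invented.

```python
def bool_match(doc, must=(), should=(), must_not=()):
    """Decide whether a document matches a simplified bool query."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # With no must clauses, at least one should clause has to match.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True


people = [
    {"name": "a", "age": 40, "state": "ID"},
    {"name": "b", "age": 40, "state": "TX"},
    {"name": "c", "age": 35, "state": "TX"},
]

# Same logic as the request above: age is 40 AND state is NOT ID.
matched = [
    p["name"] for p in people
    if bool_match(p,
                  must=[lambda d: d["age"] == 40],
                  must_not=[lambda d: d["state"] == "ID"])
]
print(matched)
```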

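5. Filters in brief

So far one detail has been skipped: the document score (the _score field in the search results). The score is a numeric value that is a relative measure of how well the document matches the search query. The higher the score, the more relevant the document; the lower the score, the less relevant it is.

Queries do not always need to produce scores, however, particularly when they are only used to "filter" the document set. Elasticsearch detects these situations and automatically optimizes query execution so that useless scores are not computed.

The bool query also supports filter clauses, which use a query to restrict the documents matched by the other clauses without changing how scores are computed. The example here uses the range query, which filters documents by a range of values and is typically used for numeric or date filtering.

5.1 range filter

// Find accounts with a balance greater than or equal to 20000 and less than or equal to 30000.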
GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}

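Breaking this down: the bool query contains a match_all query (the query part) and a range query (the filter part). Any other queries could be substituted for the query and filter parts. Here the range query makes good sense as a filter, because documents that fall within the range all match "equally": no document is more relevant than another.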
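The yes-or-no nature of the filter can be sketched in plain Python: a document is either inside the inclusive bounds or it is dropped, and nothing is ranked. range_filter and the sample balances are hypothetical, made up for illustration.

```python
def range_filter(docs, field, gte=None, lte=None):
    """Keep documents whose field lies within the inclusive bounds that are given."""
    kept = []
    for doc in docs:
        value = doc[field]
        if gte is not None and value < gte:
            continue
        if lte is not None and value > lte:
            continue
        kept.append(doc)
    return kept


accounts = [
    {"account_number": 1, "balance": 19999},
    {"account_number": 2, "balance": 20000},
    {"account_number": 3, "balance": 25000},
    {"account_number": 4, "balance": 30001},
]
kept = range_filter(accounts, "balance", gte=20000, lte=30000)
print([a["account_number"] for a in kept])
```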

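6. Aggregations

Aggregations provide the ability to group data and extract statistics from it. The easiest way to think about aggregations is as roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a single search can return hits and, separately from those hits, aggregated results in the same response. This is powerful and efficient: queries and multiple aggregations run together, and the results of both (or either) come back in one go through a concise API, avoiding extra network round trips.

6.1 group by, count

// Group all accounts by state, then return the top 10 (default) states sorted by count in descending order (also the default):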
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      }
    }
  }
}

The SQL equivalent:

SELECT state, COUNT(*) FROM bank GROUP BY state ORDER BY COUNT(*) DESC

Response:

{
  "took": 49,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1000,
    "max_score": 0,
    "hits": [ ]
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 20,
      "sum_other_doc_count": 770,
      "buckets": [
        { "key": "ID", "doc_count": 27 },
        { "key": "TX", "doc_count": 27 },
        { "key": "AL", "doc_count": 25 },
        { "key": "MD", "doc_count": 25 },
        { "key": "TN", "doc_count": 23 },
        { "key": "MA", "doc_count": 21 },
        { "key": "NC", "doc_count": 21 },
        { "key": "ND", "doc_count": 21 },
        { "key": "ME", "doc_count": 20 },
        { "key": "MO", "doc_count": 20 }
      ]
    }
  }
}

Note that size is set to 0 so that no search hits are shown, since only the aggregation results are of interest in the response.
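As a rough mental model, the terms aggregation behaves like counting documents per key and keeping the largest buckets. The sketch below does that exactly over a small in-memory list; the real aggregation runs per shard and can be approximate, which is what doc_count_error_upper_bound in the response reports. group_by_count and the sample documents are invented for illustration.

```python
from collections import Counter


def group_by_count(docs, field, size=10):
    """Count documents per field value and return the top `size` buckets."""
    counts = Counter(doc[field] for doc in docs)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


docs = [
    {"state": "ID"}, {"state": "TX"}, {"state": "ID"},
    {"state": "AL"}, {"state": "ID"}, {"state": "TX"},
]
buckets = group_by_count(docs, "state", size=2)
print(buckets)
```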

6.2 group by, count, avg

// Compute the average account balance per state (again only for the top 10 states sorted by count in descending order):
GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

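Note how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common pattern for all aggregations: aggregations can be nested inside one another arbitrarily to extract whatever pivoted summaries the data calls for.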
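The nesting can be pictured as "bucket first, then compute a metric inside each bucket". The following plain-Python sketch shows that idea on made-up data; average_by is a hypothetical helper, not an Elasticsearch API.

```python
from collections import defaultdict


def average_by(docs, group_field, value_field):
    """Group documents by one field, then average another field within each group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    # The inner metric (avg) is computed once per outer bucket.
    return {key: sum(values) / len(values) for key, values in groups.items()}


docs = [
    {"state": "ID", "balance": 100},
    {"state": "ID", "balance": 300},
    {"state": "TX", "balance": 50},
]
averages = average_by(docs, "state", "balance")
print(averages)
```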

Building on the previous aggregation, sort the states by average balance in descending order:

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state.keyword",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

This example groups by age bracket (20-29, 30-39, and 40-49), then by gender, and finally computes the average account balance per gender within each age bracket:

GET /bank/_search
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender.keyword"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}
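The bucket boundaries deserve a closer look: in a range aggregation each bucket includes its from value and excludes its to value, which is why from 20 / to 30 covers ages 20-29. A minimal in-memory sketch of the same three-level grouping follows; the function name, bucket labels, and sample data are assumptions for illustration.

```python
AGE_RANGES = [(20, 30), (30, 40), (40, 50)]


def average_balance_by_age_and_gender(docs):
    """Bucket by age range, then by gender, then average the balance."""
    result = {}
    for low, high in AGE_RANGES:
        per_gender = {}
        for doc in docs:
            # from is inclusive, to is exclusive.
            if low <= doc["age"] < high:
                per_gender.setdefault(doc["gender"], []).append(doc["balance"])
        result[f"{low}-{high - 1}"] = {
            gender: sum(balances) / len(balances)
            for gender, balances in per_gender.items()
        }
    return result


docs = [
    {"age": 29, "gender": "F", "balance": 1000},
    {"age": 20, "gender": "F", "balance": 3000},
    {"age": 30, "gender": "M", "balance": 500},
    {"age": 50, "gender": "M", "balance": 9999},  # outside every range
]
summary = average_balance_by_age_and_gender(docs)
print(summary)
```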

More on aggregations: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/search-aggregations.html
