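Steps for implementing full-text search with Java Spring Boot and Elasticsearch, with the pitfalls flagged so you can avoid them
Start with a Spring Boot project.
I chose JestClient to talk to Elasticsearch.
The other option is ElasticsearchRepository, a JPA-like repository interface, but code written against it has to change whenever the Elasticsearch version changes, so JestClient is my first choice.
OK! Step 1: add the dependencies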
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
<version>1.5.4.RELEASE</version>
</dependency>
<dependency>
<groupId>io.searchbox</groupId>
<artifactId>jest</artifactId>
</dependency>
<dependency>
<groupId>net.java.dev.jna</groupId>
<artifactId>jna</artifactId>
</dependency>
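Pitfall ①: the Elasticsearch version has to match the Spring Boot version.
Here Spring Boot is 1.5.4, and the spring-boot-starter-data-elasticsearch dependency is also 1.5.4.
Check the Spring Boot and Elasticsearch version correspondence before choosing versions.
Step 2: configure the Elasticsearch server address in application.properties. The service can only be called once the connection address is set.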
#elasticsearch
spring.elasticsearch.jest.uris=@...@
spring.elasticsearch.jest.read-timeout=60000
spring.elasticsearch.jest.connection-timeout=60000
Note: the @...@ value is a placeholder that is filled in from a property defined in pom.xml.
Next we need to connect to the service, which means obtaining a JestClient object to run queries.
Step 3: obtaining the JestClient object
package com.webi.welive.util;
import io.searchbox.client.JestClient;
import io.searchbox.client.JestClientFactory;
import io.searchbox.client.config.HttpClientConfig;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
/**
* Title: Obtain the JestClient object<br>
* Description: JestClientUtil<br>
* Company: Webi English (韋博英語) Online Education Department<br>
* CreateDate: 2018-06-14 14:29
*
* @author james.fxy
*/
@Service
public class JestClientUtil {
private static String spring_elasticsearch_jest_uris;
private static Integer spring_elasticsearch_jest_read_timeout;
private static Integer Spring_elasticsearch_jest_connection_timeout;
@Value("${spring.elasticsearch.jest.uris}")
public void setSpring_elasticsearch_jest_uris(String spring_elasticsearch_jest_uris) {
JestClientUtil.spring_elasticsearch_jest_uris = spring_elasticsearch_jest_uris;
}
@Value("${spring.elasticsearch.jest.read-timeout}")
public void setSpring_elasticsearch_jest_read_timeout(Integer spring_elasticsearch_jest_read_timeout) {
JestClientUtil.spring_elasticsearch_jest_read_timeout = spring_elasticsearch_jest_read_timeout;
}
@Value("${spring.elasticsearch.jest.connection-timeout}")
public void setGetSpring_elasticsearch_jest_connection_timeout(Integer Spring_elasticsearch_jest_connection_timeout) {
JestClientUtil.Spring_elasticsearch_jest_connection_timeout = Spring_elasticsearch_jest_connection_timeout;
}
/**
* Title: Obtain a JestClient<br>
* Description: Builds a JestClient from the injected URI and timeout settings.<br>
* CreateDate: 2018/6/14 16:31<br>
*
* @return a JestClient configured with the URI and the two timeouts
* @author james.fxy
*/
public static JestClient getJestClient() {
JestClientFactory factory = new JestClientFactory();
factory.setHttpClientConfig(new HttpClientConfig.Builder(spring_elasticsearch_jest_uris).connTimeout(Spring_elasticsearch_jest_connection_timeout).readTimeout(spring_elasticsearch_jest_read_timeout).multiThreaded(true).build());
return factory.getObject();
}
}
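getJestClient() above builds a new JestClientFactory and a new client on every call. The Jest project's documentation advises treating JestClient as a singleton rather than constructing one per request, so one option is to create it once and reuse it. The following is only a minimal sketch of that idea using the JDK alone: Lazy is a hypothetical helper (it is not part of Jest or Spring), and the Supplier stands in for the factory.getObject() call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: creates a value on first use and returns the same instance afterwards. */
final class Lazy<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = instance;
        if (result == null) {
            synchronized (this) {
                // Re-check under the lock so only one thread runs the factory
                result = instance;
                if (result == null) {
                    result = factory.get();
                    instance = result;
                }
            }
        }
        return result;
    }
}

final class LazyDemo {
    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();
        // A plain Object stands in for the JestClient here
        Lazy<Object> client = new Lazy<>(() -> {
            builds.incrementAndGet();
            return new Object();
        });
        System.out.println(client.get() == client.get()); // compares the two returned references
        System.out.println(builds.get());                 // number of times the supplier ran
    }
}
```

If you take this route, the cached client could be wrapped as new Lazy<>(() -> factory.getObject()), and shutdownClient() would then be called once when the application stops rather than after each search. Spring Boot 1.5 can also auto-configure a JestClient bean from the same spring.elasticsearch.jest.* properties, which may be simpler if injecting the client is acceptable.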
Step 4: use JestClient to work with Elasticsearch
① Pick the entity class we want to operate on
package com.webi.welive.lessonhomework.param;
import com.webi.welive.lessonhomework.entity.HomeworkAnswerMedia;
import com.webi.welive.lessonhomework.entity.HomeworkQuestionAnswer;
import com.webi.welive.lessonhomework.entity.HomeworkQuestionMedia;
import lombok.Data;
import org.springframework.data.elasticsearch.annotations.Document;
import java.util.Date;
import java.util.List;
/**
* Title: LessonHomeworkParam<br>
* Description: LessonHomeworkParam<br>
* Company: Webi English (韋博英語) Online Education Department<br>
* CreateDate: 2018-06-09 11:39:42
*
* @author james.fxy
*/
@Data
@Document(indexName = "homework", type = "homeworktable")
public class LessonHomeworkParam {
private Integer id;
private String question;
private String explain;
private Boolean isEnabled;
private Boolean isDeleted;
private Integer createUserId;
private Integer sequence;
private Integer questionTypes;
private HomeworkQuestionMediaParam homeworkQuestionsMediaParam;
private List<HomeworkQuestionAnswerParam> homeworkQuestionAnswerParamList;
public LessonHomeworkParam() {
super();
}
}
Note: @Document(indexName = "homework", type = "homeworktable")
indexName and type correspond to the index and document type you set when importing data into Elasticsearch.
This is how I configured the import:
input {
jdbc {
jdbc_connection_string => "jdbc:sqlserver://10.0.0.130:1433;databaseName=Webi_WeLiveDBTest;"
jdbc_user => "speakhi_user"
jdbc_password => "speakhi_user123"
jdbc_driver_library => "/usr/share/logstash/mssql-jdbc-6.2.1.jre8.jar"
jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_paging_enabled => "true"
statement => "SELECT * from Lesson_Homework where CreateTime < GETDATE()"
schedule => "* * * * *"
type => "lesson_homework"
}
}
output {
stdout {
codec => rubydebug
}
if[type] == "lesson_homework"{
elasticsearch {
hosts => "10.0.0.13:9200"
index => "homework"
document_type => "homeworktable"
document_id => "%{id}"
}
}
}
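As you can see, the index and document_type configured for the import match the indexName and type in the entity class's @Document annotation.
② Run the query with JestClient
The query is built by string concatenation and then executed, following the official documentation.
The concatenation can be done as shown below.
There are two ways to parse the returned data: automatic and manual.
I parse manually here because I found that automatic parsing did not work well.
Note: the data is serialized with Jackson when it is imported into Elasticsearch,
so when parsing we have to read each field using the type it has in Elasticsearch.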
/**
* Title: <br>
* Description: Query course information with full-text search (matches on question, with a filter condition added)<br>
* CreateDate: 2018/6/14 14:47<br>
*
* @param query the text to match against the question field
* @return com.mingyisoft.javabase.bean.CommonJsonObject<java.lang.Object>
* @throws Exception
* @category @author james.fxy
*/
public CommonJsonObject<Object> findAllHomeworkByElasticsearch(String query) throws Exception {
CommonJsonObject<Object> json = new CommonJsonObject<>();
JestClient jestClient = null;
try {
jestClient = JestClientUtil.getJestClient();
// If the query text starts or ends with a double quote, strip that quote
if (query.startsWith("\"")) {
query = query.substring(1);
}
if (query.endsWith("\"")) {
query = query.substring(0, query.length() - 1);
}
String queryStr = " {\"query\": { \"match\": { \"question\":\"" + query + "\" } }\n" +
" ,\n" +
" \"post_filter\": { \n" +
" \"term\" : {\n" +
" \"isdeleted\" : \"false\"\n" +
" }\n" +
" }\n" +
"}";
json = search(jestClient, indexName, typeName, queryStr);
} catch (Exception e) {
json.setCode(ErrorCodeEnum.ELASTIC_SEARCH_HAS_ERROR.getCode());
json.setMsg(ErrorCodeEnum.ELASTIC_SEARCH_HAS_ERROR.getDescription());
e.printStackTrace();
} finally {
// Release the client even when the search throws
if (jestClient != null) {
jestClient.shutdownClient();
}
}
return json;
}
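The method above only strips a leading and a trailing double quote before concatenating query into the JSON body. If the text still contains a double quote or a backslash in the middle, the resulting string is no longer valid JSON. Below is a minimal sketch, using only the JDK, of escaping the text before concatenation. MatchQueryBuilder and its method names are hypothetical and are not part of Jest; the body it produces has the same shape as queryStr above.

```java
/** Hypothetical helper that builds the same match + post_filter body as queryStr. */
final class MatchQueryBuilder {

    /** Removes one leading and one trailing double quote, as the method above does. */
    static String stripOuterQuotes(String s) {
        if (s.startsWith("\"")) {
            s = s.substring(1);
        }
        if (s.endsWith("\"")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    /** Escapes characters that may not appear unescaped inside a JSON string literal. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \ u XXXX escapes
                        sb.append("\\" + "u").append(String.format("%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the request body: match on "question", post_filter on "isdeleted". */
    static String buildQuery(String userInput) {
        String text = escapeJson(stripOuterQuotes(userInput));
        return "{\"query\":{\"match\":{\"question\":\"" + text + "\"}},"
                + "\"post_filter\":{\"term\":{\"isdeleted\":\"false\"}}}";
    }

    public static void main(String[] args) {
        // The inner quotes around REST survive as escaped quotes in the JSON body
        System.out.println(buildQuery("\"what is \"REST\"?\""));
    }
}
```

The returned string can be passed to search(jestClient, indexName, typeName, queryStr) in place of the hand-concatenated queryStr. Building the body with a JSON library would work equally well; the manual escape is used here only to keep the sketch dependency-free.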
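The jestClient and the query string are passed along together, and the query is executed with jestClient.
The result is then parsed as follows.
This is in fact the same result you get from running the queryStr string in Kibana; we parse that JSON result and return it to the front end.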
/**
* Title: Full-text search of after-class homework<br>
* Description: Full-text search method<br>
* CreateDate: 2018/6/14 14:44<br>
*
* @param jestClient
* @param indexName index name
* @param typeName index type
* @param query query statement
* @return com.mingyisoft.javabase.bean.CommonJsonObject<java.lang.Object>
* @throws Exception
* @category
* @author james.fxy
*/
public static CommonJsonObject<Object> search(JestClient jestClient, String indexName, String typeName, String query) throws Exception {
CommonJsonObject<Object> json = new CommonJsonObject<>();
// List<LessonHomeworkParam> lessonHomeworkParams = new ArrayList<>();
Search search = new Search.Builder(query)
.addIndex(indexName)
.addType(typeName)
.build();
JestResult jr = jestClient.execute(search);
// System.out.println("Full-text search--" + jr.getJsonString());
// Automatic parsing
// System.out.println("Full-text search--" + jr.getSourceAsObject(User.class));
// List<SearchResult.Hit<LessonHomeworkParam, Void>> jrList;
// jrList = ((SearchResult) jr).getHits(LessonHomeworkParam.class);
// for (SearchResult.Hit<LessonHomeworkParam, Void> lessonHomeworkParamVoidHit : jrList) {
// LessonHomeworkParam lessonHomeworkParam = lessonHomeworkParamVoidHit.source;
// lessonHomeworkParams.add(lessonHomeworkParam);
// }
// json.setData(lessonHomeworkParams);
// return json;
// }
// Manual parsing
JsonObject jsonObject = jr.getJsonObject();
JsonObject hitsobject = jsonObject.getAsJsonObject("hits");
long took = jsonObject.get("took").getAsLong();
long total = hitsobject.get("total").getAsLong();
JsonArray jsonArray = hitsobject.getAsJsonArray("hits");
System.out.println("took:" + took + " " + "total:" + total);
List<LessonHomeworkParam> lessonHomeworkParams = new ArrayList<LessonHomeworkParam>();
for (int i = 0; i < jsonArray.size(); i++) {
JsonObject jsonHitsObject = jsonArray.get(i).getAsJsonObject();
// Get the returned fields
JsonObject sourceObject = jsonHitsObject.get("_source").getAsJsonObject();
// Populate the LessonHomeworkParam object
LessonHomeworkParam lessonHomeworkParam = new LessonHomeworkParam();
lessonHomeworkParam.setId(Integer.parseInt(sourceObject.get("id").getAsNumber().toString()));
lessonHomeworkParam.setExplain(sourceObject.get("explain").getAsString());
lessonHomeworkParam.setQuestion(sourceObject.get("question").getAsString());
// lessonHomeworkParam.setCreateUserId(Integer.parseInt(sourceObject.get("createuserid").getAsNumber().toString()));
lessonHomeworkParam.setQuestionTypes(Integer.parseInt(sourceObject.get("questiontypes")
.getAsNumber().toString()));
lessonHomeworkParam.setIsDeleted(sourceObject.get("isdeleted").getAsBoolean());
lessonHomeworkParam.setIsEnabled(sourceObject.get("isenabled").getAsBoolean());
lessonHomeworkParams.add(lessonHomeworkParam);
}
json.setData(lessonHomeworkParams);
return json;
}
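Here is what running queryStr in Kibana returns.
The data returned in Kibana is the same data we parse in the Java code.
One thing to note is that the data stored in Elasticsearch was serialized with Jackson.
Explanation of the data returned by a Kibana query:
Data returned by a query in Kibana,
for example: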
{
"took": 40,
"timed_out": false,
"_shards": {
"total": 27,
"successful": 27,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1529,
"max_score": 5.710427,
"hits": [
{
"_index": "catalog",
"_type": "catalogtable",
"_id": "2406",
"_score": 5.710427,
"_source": {
"@timestamp": "2018-06-25T01:23:00.031Z",
"sequence": 8,
"id": 2406,
"isdeleted": false,
"parentid": 2197,
"createuserid": 10,
"name": "2",
"@version": "1",
"type": "lesson_catalog",
"createtime": "2018-06-23T08:20:35.183Z"
}
}
]
}
}
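The took field is the time the operation took, in milliseconds. The timed_out field indicates whether the request timed out. The hits field holds the matched records: total is the number of matching records and max_score is the highest relevance score.
The inner hits field is the array of returned records.
When parsing the data we need to pay attention to type conversion,
as in the following code: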
// Get the returned fields
JsonObject sourceObject = jsonHitsObject.get("_source").getAsJsonObject();
// Populate the LessonHomeworkParam object
LessonHomeworkParam lessonHomeworkParam = new LessonHomeworkParam();
lessonHomeworkParam.setId(Integer.parseInt(sourceObject.get("id").getAsNumber().toString()));
lessonHomeworkParam.setExplain(sourceObject.get("explain").getAsString());
lessonHomeworkParam.setQuestion(sourceObject.get("question").getAsString());
// lessonHomeworkParam.setCreateUserId(Integer.parseInt(sourceObject.get("createuserid").getAsNumber().toString()));
lessonHomeworkParam.setQuestionTypes(Integer.parseInt(sourceObject.get("questiontypes")
.getAsNumber().toString()));
lessonHomeworkParam.setIsDeleted(sourceObject.get("isdeleted").getAsBoolean());
lessonHomeworkParam.setIsEnabled(sourceObject.get("isenabled").getAsBoolean());
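The type conversion in the middle is the part we have to do by hand.
Note: the types to convert from are the ones listed under type in the index mapping. Each value has to be converted to the matching Java type before we can read out the data we want.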
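The manual parsing code calls, for example, sourceObject.get("explain").getAsString() directly. Gson's JsonObject.get returns null when a member is absent, so a document that lacks one of these fields would raise a NullPointerException. The following is a minimal sketch of null-tolerant conversions. To keep it runnable with the JDK alone, a plain Map<String, Object> stands in for the _source object, and SourceFields and its methods are hypothetical names. The keys are lowercase, as they appear in the Kibana output above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: converts loosely typed _source values to the Java types the entity expects. */
final class SourceFields {

    /** Returns the value as an Integer, or null when it is absent or not numeric. */
    static Integer asInteger(Map<String, Object> source, String key) {
        Object v = source.get(key);
        if (v instanceof Number) {
            return ((Number) v).intValue();
        }
        if (v instanceof String) {
            try {
                return Integer.valueOf(((String) v).trim());
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return null;
    }

    /** Returns the value as a Boolean, or null when it is absent or not "true"/"false". */
    static Boolean asBoolean(Map<String, Object> source, String key) {
        Object v = source.get(key);
        if (v instanceof Boolean) {
            return (Boolean) v;
        }
        if (v instanceof String) {
            String s = ((String) v).trim();
            if (s.equalsIgnoreCase("true")) {
                return Boolean.TRUE;
            }
            if (s.equalsIgnoreCase("false")) {
                return Boolean.FALSE;
            }
        }
        return null;
    }

    /** Returns the value's string form, or null when it is absent. */
    static String asString(Map<String, Object> source, String key) {
        Object v = source.get(key);
        return v == null ? null : v.toString();
    }

    public static void main(String[] args) {
        // Values modelled on the _source sample shown earlier
        Map<String, Object> source = new HashMap<>();
        source.put("id", 2406);
        source.put("isdeleted", false);
        source.put("name", "2");

        System.out.println(asInteger(source, "id"));
        System.out.println(asBoolean(source, "isdeleted"));
        System.out.println(asString(source, "explain")); // key is absent in this map
    }
}
```

With Gson itself, the same idea amounts to checking that sourceObject.get(key) is neither null nor a JSON null before calling getAsInt(), getAsBoolean() or getAsString().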
That completes the Java code for Elasticsearch full-text search.
A separate blog post covers how Elasticsearch works and how to configure it.