Introduction to Lucene search (1)

阿新 • Published: 2018-11-30

Before you can use the search features, you first need to build an index:

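import java.io.File;
import java.io.FileReader;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Builds the index.
 * @author 王晨
 */
public class Indexer {

    private IndexWriter writer; // IndexWriter instance that writes the index

    /**
     * Constructor; takes the directory in which the index is stored.
     * @param indexDir directory that holds the index
     */
    public Indexer(String indexDir) throws Exception {
        Directory dir = FSDirectory.open(Paths.get(indexDir)); // open the directory that holds the index

        Analyzer analyzer = new StandardAnalyzer(); // standard analyzer (tokenizer)
        IndexWriterConfig conf = new IndexWriterConfig(analyzer);
        writer = new IndexWriter(dir, conf);
    }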

    /**
     * Closes the index writer (like a stream, it must be closed).
     * @throws Exception
     */
    public void close() throws Exception {
        writer.close();
    }

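    /**
     * Indexes the files in the given directory, one file at a time.
     * @param dataDir directory containing the files to index
     * @return number of documents in the index
     * @throws Exception
     */
    public int index(String dataDir) throws Exception {
        File[] files = new File(dataDir).listFiles(); // all entries directly under dataDir
        if (files == null) { // listFiles() returns null if dataDir is not a readable directory
            throw new IllegalArgumentException("Not a readable directory: " + dataDir);
        }
        for (File file : files) {
            if (file.isFile()) { // skip subdirectories; FileReader cannot open them
                indexFile(file);
            }
        }
        return writer.numDocs(); // number of documents in the index
    }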

    /**
     * Indexes a single file.
     * @param file the file to index
     * @throws Exception
     */
    private void indexFile(File file) throws Exception {
        System.out.println("Indexing file: " + file.getCanonicalPath());
        Document doc = getDocument(file);
        writer.addDocument(doc);
    }

    /**
     * Builds the Lucene Document for a file and sets each of its fields
     * (one Document per file).
     * @param file the file to turn into a Document
     * @return the Document to add to the index
     * @throws Exception
     */
    private Document getDocument(File file) throws Exception {
        Document doc = new Document();
        doc.add(new TextField("contents", new FileReader(file))); // file contents: indexed, not stored
        doc.add(new TextField("filename", file.getName(), Field.Store.YES)); // add the file name to the index
        doc.add(new TextField("fullPath", file.getCanonicalPath(), Field.Store.YES)); // full path, stored so it can be printed from search results

        return doc;
    }

    public static void main(String[] args) {
        String indexDir = "D:\\lucene";
        String dataDir = "D:\\lucene\\data";
        Indexer indexer = null;
        int numIndexed = 0;
        long start = System.currentTimeMillis();
        long end = 0;
        try {
            indexer = new Indexer(indexDir); // directory the index is written to
            numIndexed = indexer.index(dataDir); // build the index; returns the number of documents indexed
            end = System.currentTimeMillis();
            // Report only on success, so the elapsed time is never computed from end = 0.
            System.out.println("Indexed " + numIndexed + " files in " + (end - start) + " ms");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexer != null) { // the constructor may have failed before assignment
                    indexer.close();
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

}

Search can be implemented in two ways: searching for a specific term (TermQuery), or parsing a query expression with QueryParser.

① Preparation

private Directory dir;
private IndexReader reader;
private IndexSearcher is;

@Before
public void setUp() throws Exception {
    dir = FSDirectory.open(Paths.get("D:\\lucene")); // directory that holds the index
    reader = DirectoryReader.open(dir); // open the index for reading
    is = new IndexSearcher(reader);
}

@After
public void tearDown() throws Exception {
    reader.close();
}

First, searching for a specific term with TermQuery (this approach is not commonly used):

/**
 * TermQuery searches for one specific term; a document is returned only if
 * it contains a term that matches exactly.
 * @throws Exception
 */
@Test
public void testTermQuery() throws Exception {
    String searchField = "contents"; // field to search in
    String str = "java"; // term the user is searching for
    Term t = new Term(searchField, str);
    Query query = new TermQuery(t);
    TopDocs hits = is.search(query, 10); // top 10 hits
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document document = is.doc(scoreDoc.doc);
        System.out.println(document.get("fullPath"));
    }
}
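
The exact-match requirement above can be illustrated with a toy inverted index in plain Java. This is only a sketch, not Lucene code: `ToyTermIndex`, `add`, and `termLookup` are hypothetical names, and the lowercasing split only roughly imitates what StandardAnalyzer does at index time. The point it tries to show is that the lookup term is compared as-is against the stored terms, which is how TermQuery behaves, since it does not run the analyzer on the search string.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, NOT Lucene): maps each term to the documents containing it. */
class ToyTermIndex {

    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Splits the text into lowercase terms (a rough stand-in for an analyzer) and records them. */
    void add(String docName, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    /** Exact lookup: the term is NOT analyzed, so case and spelling must match a stored term. */
    Set<String> termLookup(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.add("a.txt", "Java and Lucene");
        System.out.println(index.termLookup("java")); // stored term, found
        System.out.println(index.termLookup("Java")); // different case, not found
        System.out.println(index.termLookup("jav"));  // prefix only, not found
    }
}
```

Under these assumptions, a search string such as "Java" or "jav" finds nothing, while "java" does, which is why the test above uses the lowercase term.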

Searching by parsing a query expression with QueryParser:

/**
 * Parses a query expression with QueryParser.
 * To match either of two terms, separate them with a space; to require both,
 * join them with AND; append ~ to a term for a fuzzy match, e.g. particula~
 * @throws Exception
 */
@Test
public void testQueryParser() throws Exception {
    Analyzer analyzer = new StandardAnalyzer(); // standard analyzer (tokenizer)
    String searchField = "contents";
    String str = "TermQuery"; // other expressions to try: "java AND php", "jav~", "java php"
    QueryParser parser = new QueryParser(searchField, analyzer); // query parser
    Query query = parser.parse(str);
    TopDocs hits = is.search(query, 10); // top 10 hits
    System.out.println("Query \"" + str + "\" matched " + hits.totalHits + " records");
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document doc = is.doc(scoreDoc.doc);
        System.out.println(doc.get("fullPath"));
    }
}

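The is.search() call used here has no page or offset parameter, so paging has to be implemented by the caller. If you need, say, 10 pages of 10 results each, there are two ways to do it:

① Use is.search() to fetch 100 results once and keep them in memory, for example in a list; each time the user clicks "next page", return a different slice of that list.

② Use is.search() to fetch the 100 results again on every click, and return a different slice of them in the for loop.

The second approach is recommended: each is.search() call is fast, and under high concurrency, keeping every result set in memory puts heavy pressure on memory and can easily cause problems.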
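
The slicing step that both approaches rely on can be sketched in plain Java. This is a minimal illustration, not a Lucene API: `PageSlicer` and `page` are hypothetical names, and the integer list merely stands in for the `hits.scoreDocs` array that a fresh `is.search(query, pageNumber * pageSize)` call would return on each request under approach ②.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of Lucene): picks one page out of an ordered hit list. */
class PageSlicer {

    /**
     * Returns the items on the given 1-based page,
     * or an empty list when the page lies past the end of the hits.
     */
    static <T> List<T> page(List<T> hits, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long start = (long) (pageNumber - 1) * pageSize; // index of the first item on the page
        if (start >= hits.size()) {
            return Collections.emptyList();
        }
        long end = Math.min(start + pageSize, (long) hits.size()); // exclusive end index
        return new ArrayList<>(hits.subList((int) start, (int) end));
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hits.add(i); // stand-in for 25 search hits in rank order
        }
        System.out.println(page(hits, 3, 10)); // last, partial page: [20, 21, 22, 23, 24]
    }
}
```

With approach ②, nothing is kept between requests: each click re-runs the search and only the requested slice is turned into documents and returned.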
