Implementing Multithreaded Operations in Lucene

阿新 • Published: 2019-01-12

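1. Any number of read operations may run concurrently; that is, any number of users can search the same index at the same time.
2. Even while an index modification is in progress (optimizing the index, adding documents, deleting documents), any number of searches may still run concurrently.
3. Concurrent modifications are not allowed; only one index modification may run at any given time.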
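
To make the three rules concrete, here is a toy model in plain Java. It is only a sketch of the behaviour described above, not how Lucene implements it: the class name `SnapshotIndex` and its methods are made up for illustration, and the "index" is just a list of strings. Readers take no lock and always see a complete snapshot, while writers are serialized by a single lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Toy model of the three concurrency rules; hypothetical, not Lucene code.
class SnapshotIndex {
    // Readers always see a complete, immutable snapshot of the "index".
    private final AtomicReference<List<String>> snapshot =
            new AtomicReference<>(Collections.<String>emptyList());
    // Rule 3: a single lock serializes all modifications.
    private final Object writeLock = new Object();

    // Rules 1 and 2: searching takes no lock, so any number of readers
    // can run at once, even while a writer is active.
    int count(String term) {
        int hits = 0;
        for (String doc : snapshot.get()) {
            if (doc.contains(term)) hits++;
        }
        return hits;
    }

    // Copy the current snapshot, change the copy, then publish it.
    void add(String doc) {
        synchronized (writeLock) {
            List<String> next = new ArrayList<>(snapshot.get());
            next.add(doc);
            snapshot.set(Collections.unmodifiableList(next));
        }
    }

    // Four writer threads add 100 documents each; returns the final hit count.
    static int demo() {
        SnapshotIndex index = new SnapshotIndex();
        List<Thread> writers = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < 100; i++) index.add("doc about lucene");
            });
            writers.add(w);
            w.start();
        }
        for (Thread w : writers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return index.count("lucene");
    }

    public static void main(String[] args) {
        System.out.println("documents matching 'lucene': " + demo());
    }
}
```

Because every modification goes through `writeLock`, no update is lost, and a search that starts in the middle of a write simply sees the previous snapshot.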

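Lucene already handles thread safety internally, and many operations use a lock for thread synchronization. As long as you follow a few rules, Lucene can be used safely in a multithreaded environment.
Option 1:
Recommendations: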

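1. Directory and Analyzer are both thread-safe types; creating one singleton instance of each is enough.
2. All threads should use the same IndexModifier object for index modifications.
3. IndexWriter/IndexReader/IndexModifier/IndexSearcher should preferably share the same Directory object; otherwise concurrent reads and writes from multiple threads may raise a FileNotFoundException.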

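IndexModifier wraps the common operations of IndexWriter and IndexReader and synchronizes them internally. Using it avoids the hassle of synchronizing across several objects when IndexWriter and IndexReader are used together. Once all modifications are done, remember to call Close() to release the underlying resources. Optimize() does not need to be called after every operation; run it periodically, as the situation requires.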

--------

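The demo code below is a simple wrapper that exposes IndexModifier as a singleton. It ensures that all threads use the same object and that only the last thread still using it actually closes it, through the Close method.
The code is incomplete and for reference only! It needs some changes before it can be used in a real project.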

// Obtaining and closing the index modifier

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexModifier;
import org.apache.lucene.store.LockObtainFailedException;

public class MyIndexModifier {

    private static Analyzer analyzer = new StandardAnalyzer();

    private static IndexModifier modifier;

    // Threads currently using the shared modifier; also serves as the lock object.
    private static ArrayList<Thread> threadList = new ArrayList<Thread>();

    static final File INDEX_DIR = new File("D:/docindex");

    private MyIndexModifier() { }

    public static IndexModifier GetInstance()
    {
        synchronized (threadList)
        {
            if (modifier == null)
            {
                try {
                    // create == false: open the existing index in INDEX_DIR
                    modifier = new IndexModifier(INDEX_DIR, analyzer, false);

                    // Parameter settings for index performance testing
                    modifier.setMergeFactor(1000);

                    System.out.println("MergeFactor: " + modifier.getMergeFactor());
                    System.out.println("MaxBufferedDocs: " + modifier.getMaxBufferedDocs());
                } catch (CorruptIndexException e) {
                    e.printStackTrace();
                } catch (LockObtainFailedException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }

            // Register the calling thread as a user of the modifier.
            if (!threadList.contains(Thread.currentThread()))
                threadList.add(Thread.currentThread());

            // Still null if opening the index failed above.
            return modifier;
        }
    }

    public static void Close()
    {
        synchronized (threadList)
        {
            if (threadList.contains(Thread.currentThread()))
                threadList.remove(Thread.currentThread());

            // Only the last registered thread really closes the modifier.
            if (threadList.size() == 0)
            {
                try {
                    if (modifier != null)
                    {
                        modifier.close();
                        modifier = null;
                    }
                } catch (CorruptIndexException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

// Worker thread class

import java.io.IOException;
import java.util.Date;

import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexModifier;
import org.apache.lucene.index.StaleReaderException;
import org.apache.lucene.store.LockObtainFailedException;

// Project-specific helper; not part of Lucene or the JDK.
import com.miracle.dm.framework.common.TimestampConverter;

public class TestModifer extends Thread {

    private static Logger logger = LogManager.getLogger(TestModifer.class);

    @Override
    public void run() {
        IndexModifier writer = MyIndexModifier.GetInstance();

        // Delete the document with internal id 0.
        try {
            writer.deleteDocument(0);
        } catch (StaleReaderException e1) {
            e1.printStackTrace();
        } catch (CorruptIndexException e1) {
            e1.printStackTrace();
        } catch (LockObtainFailedException e1) {
            e1.printStackTrace();
        } catch (IOException e1) {
            e1.printStackTrace();
        }

        // Add ten documents, each holding the current date.
        for (int x = 0; x < 10; x++)
        {
            Document doc = new Document();
            TimestampConverter converter = new TimestampConverter();
            Date date = new Date();
            String docDate = converter.timestampToShortStr(date);
            doc.add(new Field("docDate", docDate, Field.Store.YES, Field.Index.TOKENIZED));

            try {
                writer.addDocument(doc);
            } catch (CorruptIndexException e) {
                e.printStackTrace();
            } catch (LockObtainFailedException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        logger.debug("" + Thread.currentThread() + "," + writer.docCount());

        MyIndexModifier.Close(); // Note: this is not IndexModifier.Close()!
    }
}
// Multithreaded test code

import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;

public class test {

    private static Logger logger = LogManager.getLogger(test.class);

    public test() {
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // Start 100 worker threads that all share one IndexModifier.
        for (int i = 0; i < 100; i++)
        {
            new TestModifer().start();
        }
    }
}
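
The thread-list bookkeeping in MyIndexModifier is essentially reference counting: open on first use, close when the last user leaves. As a hedged alternative, the same idea can be sketched with a plain counter and no Lucene types. The class name `SharedResource` and its `acquire()`/`release()` methods are hypothetical. Unlike the thread list, this version counts acquisitions rather than distinct threads, so each `acquire()` must be paired with exactly one `release()`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical reference-counted holder for one shared, closeable resource.
class SharedResource<T extends Closeable> {
    private final Supplier<T> factory;
    private T instance;   // null while nobody holds the resource
    private int holders;  // acquisitions that have not been released yet

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    // Creates the resource on first use and hands the same one to every caller.
    synchronized T acquire() {
        if (instance == null) {
            instance = factory.get();
        }
        holders++;
        return instance;
    }

    // Closes the resource only when the last holder releases it.
    synchronized void release() {
        if (holders == 0) {
            throw new IllegalStateException("release() without acquire()");
        }
        holders--;
        if (holders > 0) {
            return;
        }
        T last = instance;
        instance = null;
        try {
            last.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the demo above, a worker would call `acquire()` where it now calls `GetInstance()` and `release()` in a `finally` block where it now calls `Close()`, so an exception in the middle of `run()` cannot leave the count stuck above zero.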

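Note: anyone using a recent version of Lucene will find that IndexModifier is no longer recommended. The API documentation shows that IndexModifier has been superseded by IndexWriter, which itself provides methods for adding, deleting, and updating indexed documents.

The approach here is hand-coded, and I don't know whether, with thousands of users or more operating on the index, Close could go unexecuted for a long time, so that the latest index updates cannot be searched. I'd appreciate it if readers could help me think about whether this problem exists and, if so, how to solve it.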
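
One possible direction for the question above, offered as an untested assumption rather than a verified fix: instead of relying on Close to make changes visible, flush the writer after every N modifications. The sketch below uses the JDK's `Flushable` as a stand-in for whatever flush or commit call the real writer offers (IndexModifier has a `flush()` method, and later IndexWriter versions have `commit()`, as far as I know). The class name `FlushEveryN` and the threshold are hypothetical.

```java
import java.io.Flushable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Counts index changes and flushes the underlying writer every `threshold`
// changes. Hypothetical helper; `Flushable` stands in for the real writer.
class FlushEveryN {
    private final Flushable target;
    private final int threshold;
    private int pending; // changes recorded since the last flush

    FlushEveryN(Flushable target, int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.target = target;
        this.threshold = threshold;
    }

    // Call once after each add or delete. Returns true if this call flushed.
    synchronized boolean recordChange() {
        pending++;
        if (pending < threshold) {
            return false;
        }
        try {
            target.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending = 0;
        return true;
    }
}
```

Keep in mind that a Lucene searcher only sees the index as it was when the searcher was opened, so a flush on the writing side still has to be paired with reopening the searcher on the reading side.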

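Option 2: use an existing Lucene-based framework, such as Compass

It adds real-time indexing on top of Lucene. It can work with Hibernate: when the database is updated, the index is updated automatically.

1. Overview

Compass brings Lucene, Spring, and Hibernate together, making it possible to add search to an enterprise application quickly and at very low cost.

springside uses Compass for its book search. The quick setup goes as follows:

1. Map the Book object to Lucene with simple Compass annotations.

2. Configure the Spring MVC based Index Controller and Search Controller that Compass provides by default.

3. Write the page that displays the query results, rendering the variables returned by the controller.

2. Annotation configuration for Object/Search Engine Mapping

Using JDK 5 annotations for OSEM (Object/Search Engine Mapping) is much simpler than using XML files. Below is a simple searchable class. Note the three annotations @SearchableId, @SearchableProperty, and @SearchableComponent, which mark the primary key, a searchable property, and an associated object that is itself searchable, respectively. Compass also requires the POJO to have a default constructor and to implement equals() and hashCode():

For details, see Product.java, Book.java, and Category.java in springside.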

public class Product {

    @SearchableId
    private Integer id;

    private Category category;

    private String name;

    private Double unitprice;

    @SearchableProperty(name = "name")
    public String getName() {
        return this.name;
    }

    @SearchableComponent(refAlias = "category")
    public Category getCategory() {
        return this.category;
    }

    public Double getUnitprice() {
        return this.unitprice;
    }
}
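
To show the general idea behind annotation-driven mapping, here is a self-contained toy version that uses only the JDK. It is not Compass's implementation: `ToySearchable`, `ToyBook`, and `ToyMapper` are made-up names, and the "index document" is just a map from field name to value. The mapper finds the annotated getters by reflection and ignores everything else.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Local stand-in for a "searchable property" marker; not a Compass annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ToySearchable {
    String name();
}

class ToyBook {
    private final String title;
    private final Double unitprice;

    ToyBook(String title, Double unitprice) {
        this.title = title;
        this.unitprice = unitprice;
    }

    @ToySearchable(name = "title")
    public String getTitle() {
        return title;
    }

    // Not annotated, so the mapper below skips it.
    public Double getUnitprice() {
        return unitprice;
    }
}

class ToyMapper {
    // Collects the value of every annotated getter under its declared field name.
    static Map<String, String> toFields(Object pojo) {
        Map<String, String> fields = new TreeMap<>();
        for (Method m : pojo.getClass().getDeclaredMethods()) {
            ToySearchable marker = m.getAnnotation(ToySearchable.class);
            if (marker == null) {
                continue;
            }
            try {
                fields.put(marker.name(), String.valueOf(m.invoke(pojo)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot read " + m.getName(), e);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(toFields(new ToyBook("Lucene in Action", 39.9)));
    }
}
```

A real OSEM layer also has to handle ids, nested components, and type conversion; the sketch only shows how the annotations let the mapping live next to the properties they describe.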

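3. Integration with Spring and Hibernate

3.1 Spring configuration file

You are presumably already familiar with Hibernate's sessionFactory and transactionManager, so I'll skip over them here (there's no author's fee at stake, after all ^_^). Compass wraps its Spring integration very well, which makes it even simpler to use: we can have search engine retrieval working without writing a single line of code for Compass. Below is a concise Compass configuration in Spring. For details, see applicationContext-lucene.xml in springside: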

<bean id="compass" class="org.compass.spring.LocalCompassBean">
    <property name="classMappings">
        <list>
            <value>org.springside.bookstore.domain.Book</value>
        </list>
    </property>

    <property name="compassSettings">
        <props>
            <prop key="compass.engine.connection">file://${user.home}/springside/compass</prop>
            <prop key="compass.transaction.factory">org.compass.spring.transaction.SpringSyncTransactionFactory</prop>
        </props>
    </property>

    <property name="transactionManager" ref="transactionManager"/>
</bean>
<bean id="hibernateGpsDevice" class="org.compass.spring.device.hibernate.SpringHibernate3GpsDevice">
    <property name="name">
        <value>hibernateDevice</value>
    </property>
    <property name="sessionFactory" ref="sessionFactory"/>
</bean>
<bean id="compassGps" class="org.compass.gps.impl.SingleCompassGps" init-method="start" destroy-method="stop">
    <property name="compass" ref="compass"/>
    <property name="gpsDevices">
        <list>
            <ref local="hibernateGpsDevice"/>
        </list>
    </property>
</bean>
</beans>

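Points to note in the configuration above:

annotationConfiguration: use annotation-based configuration and specify the POJOs to map, such as Book (in the snippet above, Book is listed under classMappings).
compass.engine.connection: the path on the server where the index files are stored.
hibernateGpsDevice: the binding to Hibernate. It uses the Hibernate 3 event system and supports Real Time Data Mirroring, so data changes made through Hibernate are mirrored to the index automatically.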

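3.2 Web controller configuration

Both controllers are ready-made; you only need to configure the relevant options.

For details, see bookstore-servlet.xml in springside.

<property name="searchView" value="/home/top.jsp"/>

</bean>

3.3 View JSP

Simple search page: only a single query parameter is needed:

Results page:

The results page receives several variables, including:

searchResults (the search results), which includes hits (the results) and searchtime (the time taken)
pages (paging information), which includes page_from, page_to, and so on
command (the original query request)

For actual usage, see advancedSearch.jsp in springside. Below is a simplified version of the code: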

<c:if test="${not empty searchResults}"> Time taken:

<c:choose>

<c:when test="${hit.alias == 'book'}">

<br/> Author: ${hit.data.author}<br/>

</div>

</c:when>

</c:choose>

</c:forEach>

</c:if>

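4. Extending advanced search

Extending advanced search is actually simple. SpringSide already provides a basic wrapper covering conditions such as "contains any of the following words", "contains none of the following words", and category selection, as well as the number of results per page.

If you need more conditions:

1. Enhance the search page so that it shows the additional conditions.

2. Extend Compass's command class to accept the conditions passed from the search form page. You can extend springside's AdvancedSearchCommand or Compass's original class.

3. Extend Compass's searchController so that it turns the variables in the command into a query variable that follows Lucene's syntax rules (see AdvancedSearchController in springside). It can also look up data for the search form page, such as the list of book categories.

You can extend springside's AdvancedSearchController and override onSetupCommand(), following the parent class's approach and adding your own conditions. Override referenceData() to put conditions such as the book category list into the referenceData Map of AdvancedSearchCommand for the search form page to display; see BookSearchController for an example.

Alternatively, follow what BookSearchController and AdvancedSearchController do and write the extension entirely on your own.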
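
As a rough illustration of step 3, the sketch below turns three form conditions into a single string in Lucene query syntax, where `+` marks a required clause and `-` a prohibited one. It is not springside's AdvancedSearchController: the class name `AdvancedQueryBuilder`, the method `build`, and the `category` field name are assumptions, and the words are assumed to be plain terms with no characters that need escaping.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that assembles a Lucene query string from form input.
class AdvancedQueryBuilder {
    static String build(List<String> anyWords, List<String> noneWords, String category) {
        StringBuilder query = new StringBuilder();
        // "Contains any of these words": one required group of optional terms.
        if (!anyWords.isEmpty()) {
            query.append("+(").append(String.join(" ", anyWords)).append(")");
        }
        // "Contains none of these words": one prohibited clause per word.
        for (String word : noneWords) {
            query.append(" -").append(word);
        }
        // Category selection: a required clause on the assumed "category" field.
        if (category != null && !category.isEmpty()) {
            query.append(" +category:").append(category);
        }
        return query.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(build(
                Arrays.asList("java", "lucene"),
                Collections.singletonList("spam"),
                "books"));
    }
}
```

A query made up only of prohibited clauses matches nothing in Lucene, so a real controller should require at least one positive condition before running the search.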
