Configuring Chinese Tokenizers in Solr

1. Using Solr's Bundled Tokenizer

1.1. Copy the JAR

cp /opt/solr/solr-7.3.1/contrib/analysis-extras/lucene-libs/lucene-analyzers-smartcn-7.3.1.jar /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/lib

1.2. Edit managed-schema

Edit /opt/solr/solrhome/new_core/conf/managed-schema and add the following:

<fieldType name="text_ik_zd" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
    </analyzer>
</fieldType>
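
A field type only takes effect once a field references it. As a minimal sketch, assuming a hypothetical field named content_cn (not part of the original setup), the declaration in the same managed-schema might look like this:

```xml
<!-- Hypothetical field name; "text_ik_zd" is the fieldType defined above -->
<field name="content_cn" type="text_ik_zd" indexed="true" stored="true"/>
```

Documents indexed into this field would then be segmented by the smartcn tokenizer at both index and query time.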

Then restart Tomcat.

2. Configuring the IK Chinese Tokenizer

2.1. Copy the files

cp solr-analyzer-ik-5.1.0.jar ik-analyzer-solr5-5.x.jar /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/lib
cp IKAnalyzer.cfg.xml ext.dic stopword.dic /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/classes

2.2. Edit managed-schema

Edit /opt/solr/solrhome/new_core/conf/managed-schema and add the following:

<fieldType name="text_ik" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
    </analyzer>
</fieldType>
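
The useSmart attribute switches IK between its coarse-grained "smart" segmentation (true) and its finest-grained segmentation (false). A common variation, shown here only as a sketch, indexes with fine-grained segmentation and queries with smart segmentation, so that more index terms are available to match. The type name text_ik_fine and the field name title_cn are hypothetical. As with the bundled tokenizer, Tomcat has to be restarted before the new JARs and schema changes are picked up.

```xml
<!-- Hypothetical variant: fine-grained at index time, smart at query time -->
<fieldType name="text_ik_fine" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="false"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
    </analyzer>
</fieldType>

<!-- Hypothetical field that uses the variant type -->
<field name="title_cn" type="text_ik_fine" indexed="true" stored="true"/>
```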