Configuring Chinese Tokenizers in Solr
阿新 • Published: 2019-02-02
1. Using the tokenizer bundled with Solr
1.1 Copy the JAR
cp /opt/solr/solr-7.3.1/contrib/analysis-extras/lucene-libs/lucene-analyzers-smartcn-7.3.1.jar /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/lib
1.2 Edit managed-schema
Edit the file /opt/solr/solrhome/new_core/conf/managed-schema and add the following:
<fieldType name="text_ik_zd" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
  </analyzer>
</fieldType>
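A field type only takes effect once a field references it. A minimal sketch, assuming a hypothetical field named content_cn (this name is not part of the original setup), added to the same managed-schema:

```xml
<!-- Hypothetical field for illustration; choose a name that fits your own schema -->
<field name="content_cn" type="text_ik_zd" indexed="true" stored="true"/>
```

Once Tomcat has been restarted, the Analysis screen in the Solr admin UI can be used to check how sample Chinese text is split by text_ik_zd.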
Restart Tomcat for the change to take effect.
2. Configuring the IK Chinese tokenizer
2.1 Copy the files
cp solr-analyzer-ik-5.1.0.jar ik-analyzer-solr5-5.x.jar /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/lib
cp IKAnalyzer.cfg.xml ext.dic stopword.dic /opt/tomcat/apache-tomcat-8.5.31/webapps/solr/WEB-INF/classes
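IKAnalyzer.cfg.xml is what tells IK which extension dictionary and stopword files to load from the classpath. The sketch below shows the layout this file typically has in IK Analyzer distributions. The copy shipped with your download may differ, so treat it as an assumption and compare it against the file you actually copied:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Extension dictionary, resolved relative to WEB-INF/classes -->
    <entry key="ext_dict">ext.dic;</entry>
    <!-- Extension stopword dictionary, resolved relative to WEB-INF/classes -->
    <entry key="ext_stopwords">stopword.dic;</entry>
</properties>
```

The file names in the two entries should match the ext.dic and stopword.dic files copied above.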
2.2 Edit managed-schema
Edit the file /opt/solr/solrhome/new_core/conf/managed-schema and add the following:
<fieldType name="text_ik" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
  </analyzer>
</fieldType>
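As with the bundled tokenizer, text_ik does nothing until a field uses it. A minimal sketch, assuming a hypothetical field named content_ik (not part of the original setup), added to the same managed-schema:

```xml
<!-- Hypothetical field for illustration; choose a name that fits your own schema -->
<field name="content_ik" type="text_ik" indexed="true" stored="true"/>
```

As in section 1, Tomcat presumably needs a restart before the new JARs and schema changes are picked up.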