
9. Solr 4.10.3 data import (post.jar and curl)


When reposting, please credit the source: http://www.cnblogs.com/hd3013779515/
1. Using post.jar
java -Durl=http://192.168.137.168:8080/solr/mycore/update -Ddata=files -jar /usr/local/solr-4.10.3/example/exampledocs/post.jar /usr/local/solr-4.10.3/example/multicore/exampledocs/ipod_other.xml

2. Using curl

Delete all data

curl http://192.168.137.168:8080/solr/mycore/update?commit=true -H "Content-Type: text/xml" --data-binary "<delete><query>*:*</query></delete>"
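
Wiping the whole index is not always what you want. As a minimal sketch, the same update endpoint also accepts a delete-by-id message. The helper name `delete_by_id_xml` is illustrative, the sample id is borrowed from the books.csv used later in this post, and the id is assumed to contain no XML special characters. The script only prints the curl command (a dry run) so it can be checked before it is run against a server.

```shell
#!/bin/sh
# Update endpoint used throughout this post.
SOLR_UPDATE_URL="http://192.168.137.168:8080/solr/mycore/update"

# Build a delete-by-id message for a single document.
# Assumption: the id contains no XML special characters (&, <, >).
delete_by_id_xml() {
    printf '<delete><id>%s</id></delete>' "$1"
}

payload=$(delete_by_id_xml "0553573403")

# Dry run: print the command instead of contacting the server.
echo "curl \"$SOLR_UPDATE_URL?commit=true\" -H \"Content-Type: text/xml\" --data-binary \"$payload\""
```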

Import XML document data

curl http://192.168.137.168:8080/solr/mycore/update?commit=true --data-binary @/usr/local/solr-4.10.3/example/multicore/exampledocs/ipod_other.xml -H 'Content-type:text/xml; charset=utf-8'

Import JSON document data

curl http://192.168.137.168:8080/solr/mycore/update?commit=true --data-binary @/home/test/books.json -H 'Content-type:application/json; charset=utf-8'
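
The contents of /home/test/books.json are not shown in this post. As a rough sketch only, the script below writes a small sample file in Solr's JSON update format (an array of documents) and prints the matching import command. The field names and values are assumptions borrowed from the books.csv used later in this post, and they only work if the core's schema defines those fields.

```shell
#!/bin/sh
# Write a small sample file in Solr's JSON update format (an array of documents).
# Assumption: the fields mirror books.csv; the real books.json is not shown in the post.
sample_json=$(mktemp)
cat > "$sample_json" <<'EOF'
[
  {"id": "0553573403", "name": "A Game of Thrones", "price": 7.99, "inStock": true},
  {"id": "0553293354", "name": "Foundation", "price": 7.99, "inStock": true}
]
EOF

# Dry run: print the import command for the sample file.
echo "curl 'http://192.168.137.168:8080/solr/mycore/update?commit=true' --data-binary @$sample_json -H 'Content-type:application/json; charset=utf-8'"
```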

Import CSV document data

Our CSV file (books.csv) has the following content:

id,name,price,inStock,author,series_t,sequence_i,genre_s
0553573403,A Game of Thrones,7.99,true,George R.R. Martin,"A Song of Ice and Fire",1,fantasy
0553579908,A Clash of Kings,7.99,true,George R.R. Martin,"A Song of Ice and Fire",2,fantasy
055357342X,A Storm of Swords,7.99,true,George R.R. Martin,"A Song of Ice and Fire",3,fantasy
0553293354,Foundation,7.99,true,Isaac Asimov,Foundation Novels,1,scifi
0812521390,The Black Company,6.99,false,Glen Cook,The Chronicles of The Black Company,1,fantasy
0812550706,Ender's Game,6.99,true,Orson Scott Card,Ender,1,scifi
0441385532,Jhereg,7.95,false,Steven Brust,Vlad Taltos,1,fantasy
0380014300,Nine Princes In Amber,6.99,true,Roger Zelazny,the Chronicles of Amber,1,fantasy
0805080481,The Book of Three,5.99,true,Lloyd Alexander,The Chronicles of Prydain,1,fantasy
080508049X,The Black Cauldron,5.99,true,Lloyd Alexander,The Chronicles of Prydain,2,fantasy

To import the CSV data above correctly, modify solrconfig.xml as follows:

<requestHandler name="/update/csv" class="solr.CSVRequestHandler" startup="lazy">
<lst name="defaults">
   <str name="separator">,</str>
   <str name="header">true</str>
   <str name="skip">genre_s</str>
   <str name="encapsulator">"</str>
</lst>
</requestHandler>

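Notes:

startup="lazy": tells Solr to instantiate this update handler only when it is first used.

<str name="separator">,</str>: tells Solr that fields are separated by a comma (",").

<str name="header">true</str>: tells Solr that a header line with the field names precedes the data rows.

<str name="skip">genre_s</str>: tells Solr to ignore the genre_s column.

<str name="encapsulator">"</str>: tells Solr that values are enclosed in double quotes (").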
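
The same options can also be supplied as request parameters instead of handler defaults. The following is a minimal sketch, assuming the `/update/csv` path registered in the configuration above and URL-encoded values (`%2C` for the comma, `%22` for the double quote). The variable names are illustrative, and the script only prints the resulting curl command.

```shell
#!/bin/sh
# Assumption: the CSV handler is reachable at the /update/csv path configured above.
SOLR_CSV_URL="http://192.168.137.168:8080/solr/mycore/update/csv"

# CSV options as request parameters; special characters are URL-encoded.
params="commit=true"
params="$params&separator=%2C"      # fields are separated by ","
params="$params&header=true"        # first line holds the field names
params="$params&skip=genre_s"       # do not index the genre_s column
params="$params&encapsulator=%22"   # values may be wrapped in double quotes

request_url="$SOLR_CSV_URL?$params"

# Dry run: print the command instead of contacting the server.
echo "curl '$request_url' --data-binary @/home/test/books.csv -H 'Content-type:text/csv; charset=utf-8'"
```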

Once the configuration is in place, restart Solr and submit the data:

curl http://192.168.137.168:8080/solr/mycore/update?commit=true --data-binary @/home/test/books.csv -H 'Content-type:text/csv; charset=utf-8'
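
One way to sanity-check the import is to compare the number of data rows in the file with the `numFound` value returned by a match-all query. This is a sketch under a few assumptions: the helper `count_csv_rows` is hypothetical, it ignores blank lines, and it does not handle quoted values that span several lines. The query command is only printed, not executed.

```shell
#!/bin/sh
# Count the data rows of a CSV file: non-empty lines minus the header line.
# Assumption: no quoted value contains an embedded newline.
count_csv_rows() {
    total=$(grep -c . "$1")
    echo $((total - 1))
}

# Match-all query that returns only the hit count (rows=0).
SOLR_SELECT_URL="http://192.168.137.168:8080/solr/mycore/select?q=*:*&rows=0&wt=json"

# Small self-check with a throwaway file (header plus 2 rows, with blank lines).
tmp_csv=$(mktemp)
printf 'id,name\n\n1,foo\n\n2,bar\n' > "$tmp_csv"
echo "rows in sample file: $(count_csv_rows "$tmp_csv")"

# Dry run: print the query; compare numFound in its response with the local count.
echo "curl '$SOLR_SELECT_URL'"
```

For the books.csv shown earlier, `count_csv_rows /home/test/books.csv` should report 10, provided the index was emptied before the import.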
