Hive 2.3.3 Installation Tutorial

Installing Hive

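By default, Hive does not support record-level update, insert, or delete operations, and it does not support transactions (since Hive 0.14 these are available only on tables explicitly configured as transactional). Hive queries also have fairly high latency.
Pig is commonly used as part of an ETL (extract, transform, load) process: external data is loaded into the Hadoop cluster and then transformed into the desired format.
Prerequisite: Hadoop is already installed and the environment variable $HADOOP_HOME is configured.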

Download the package

https://hive.apache.org/downloads.html

This article uses apache-hive-2.3.3-bin.tar.gz as the example.

Extract

tar -zxvf apache-hive-2.3.3-bin.tar.gz

Add environment variables

Add the following lines to .bashrc or profile. This article adds them to .bashrc.

export HIVE_HOME={{pwd}}
export PATH=$HIVE_HOME/bin:$PATH

Here {{pwd}} is the path of the extracted Hive directory.
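After reloading .bashrc, it can help to confirm that HIVE_HOME really points at the extracted directory. The following is a minimal sketch, not part of the original steps: the function name check_hive_home is hypothetical, and the demo runs against a throwaway directory so it works without a real Hive install.

```shell
#!/usr/bin/env bash
# check_hive_home is a hypothetical helper: it only checks that the given
# directory contains an executable bin/hive launcher.
check_hive_home() {
  local home="$1"
  if [ -z "$home" ]; then
    echo "HIVE_HOME is not set"
    return 1
  fi
  if [ ! -x "$home/bin/hive" ]; then
    echo "no executable bin/hive under $home"
    return 1
  fi
  echo "HIVE_HOME looks valid: $home"
}

# Demo against a fake layout (assumption: stands in for the real install).
fake_home="$(mktemp -d)"
mkdir -p "$fake_home/bin"
touch "$fake_home/bin/hive"
chmod +x "$fake_home/bin/hive"
check_hive_home "$fake_home"
```

In a real shell session the call would be check_hive_home "$HIVE_HOME".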

Create the tmp directory (it usually exists by default) and the warehouse directory in HDFS

(In addition, you must use below HDFS commands to create /tmp and /user/hive/warehouse (aka hive.metastore.warehouse.dir) and set them chmod g+w before you can create a table in Hive.)

  $ $HADOOP_HOME/bin/hdfs dfs -mkdir -p /tmp
  $ $HADOOP_HOME/bin/hdfs dfs -mkdir -p /user/hive/warehouse
  $ $HADOOP_HOME/bin/hdfs dfs -chmod g+w   /tmp
  $ $HADOOP_HOME/bin/hdfs dfs -chmod g+w   /user/hive/warehouse

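Create hive-site.xml in the conf directory (use hive-default.xml.template as a reference) and add the database connection settings inside its <configuration> element, as shown below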

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://127.0.0.1:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>

Initialize the schema

bin/schematool -initSchema -dbType mysql
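Because the metastore is configured with com.mysql.jdbc.Driver, schematool can only connect if a MySQL Connector/J JAR is on Hive's classpath, typically under $HIVE_HOME/lib. The original steps do not mention this, so the check below is only a sketch under assumptions: the helper name has_mysql_driver, the file-name pattern mysql-connector-*.jar, and the version number in the demo are all illustrative.

```shell
#!/usr/bin/env bash
# has_mysql_driver is a hypothetical helper: it succeeds if the given lib
# directory contains a file matching mysql-connector-*.jar (assumed naming).
has_mysql_driver() {
  local lib_dir="$1"
  ls "$lib_dir"/mysql-connector-*.jar >/dev/null 2>&1
}

# Demo with throwaway directories standing in for $HIVE_HOME/lib.
empty_lib="$(mktemp -d)"
full_lib="$(mktemp -d)"
touch "$full_lib/mysql-connector-java-5.1.46.jar"  # illustrative file name

if has_mysql_driver "$full_lib"; then
  echo "driver found"
fi
if ! has_mysql_driver "$empty_lib"; then
  echo "driver missing: copy the connector JAR into the lib directory"
fi
```

In a real install the call would be has_mysql_driver "$HIVE_HOME/lib", run before bin/schematool.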

Test

Run bin/hive to start the Hive CLI, then enter the following commands to test:

CREATE TABLE x (a INT);
SELECT * FROM x;
DROP TABLE x;

Problems

After bin/hive starts, it reports the following error

Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
    at org.apache.hadoop.fs.Path.initialize(Path.java:206)
    at org.apache.hadoop.fs.Path.<init>(Path.java:172)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:659)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:582)
    at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:549)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
    at java.net.URI.checkPath(URI.java:1823)
    at java.net.URI.<init>(URI.java:745)
    at org.apache.hadoop.fs.Path.initialize(Path.java:203)
    ... 12 more

Modify the following entry in the configuration file:

  <property>
    <name>hive.querylog.location</name>
    <value>${system:java.io.tmpdir}/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
  </property>

Wherever a <value> contains ${system:java.io.tmpdir} or ${system:user.name}, replace the placeholder with a concrete path.
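If the placeholders appear in several properties, editing each one by hand is tedious. The replacement can be sketched with sed as below. The function name fix_hive_placeholders, the path /tmp/hive, and the user name hiveuser are assumptions chosen for illustration, and the demo operates on a temporary file rather than a real hive-site.xml.

```shell
#!/usr/bin/env bash
# fix_hive_placeholders is a hypothetical helper: it rewrites the two
# ${system:...} placeholders in a config file, keeping a .bak backup.
fix_hive_placeholders() {
  local conf_file="$1"
  local tmp_dir="$2"     # concrete replacement for ${system:java.io.tmpdir}
  local user_name="$3"   # concrete replacement for ${system:user.name}
  sed -i.bak \
    -e "s|\${system:java.io.tmpdir}|${tmp_dir}|g" \
    -e "s|\${system:user.name}|${user_name}|g" \
    "$conf_file"
}

# Demo on a temporary file holding the value from the property above.
demo_conf="$(mktemp)"
printf '%s\n' '<value>${system:java.io.tmpdir}/${system:user.name}</value>' > "$demo_conf"
fix_hive_placeholders "$demo_conf" "/tmp/hive" "hiveuser"
cat "$demo_conf"   # prints <value>/tmp/hive/hiveuser</value>
```

Against a real install the first argument would be $HIVE_HOME/conf/hive-site.xml, and the chosen directory should exist and be writable by the user running Hive.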