
Installing, Deploying, and Testing Hive

Tags (space-separated): hive
1) Configure Hive to work with Hadoop; create a table and load data to test it
2) Install the MySQL database on Linux
3) Store the Hive metastore in MySQL and inspect the metadata tables
4) Get familiar with basic DML and DDL statements (creating databases and tables, loading data, and simple SELECT queries)

Hadoop/Spark/Kafka discussion group: 224209501

1. Documentation and environment requirements

1.1 Related documentation

1.2 Environment requirements:

  • Java 1.7
  • Hadoop 2.x (preferred)
  • The environment used in this article is CentOS 6.4

2. Installing Hive

2.1 Extract apache-hive-0.13.1-bin.tar.gz

$ tar -zxvf apache-hive-0.13.1-bin.tar.gz -C /opt/modules/

2.2 Set HIVE_HOME to the Hive install directory

$ cd hive-0.13.1
$ export HIVE_HOME=$(pwd)

2.3 Add HIVE_HOME to the system PATH

$ export PATH=$HIVE_HOME/bin:$PATH
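To survive a new login shell, the two exports above are usually persisted in a profile file. A minimal sketch, using a temporary file in place of ~/.bashrc so it is self-contained, and assuming the /opt/modules/hive-0.13.1 install path used throughout this article:

```shell
# Stand-in for ~/.bashrc so the demo is self-contained; the install path
# is an assumption -- adjust it to your own.
HIVE_HOME=/opt/modules/hive-0.13.1
PROFILE=$(mktemp)
cat >> "$PROFILE" <<EOF
export HIVE_HOME=$HIVE_HOME
export PATH=\$HIVE_HOME/bin:\$PATH
EOF
. "$PROFILE"              # normally: source ~/.bashrc
echo "$PATH" | grep -q "$HIVE_HOME/bin" && echo "PATH ok"
```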

3. Running Hive

3.1 Create the required directories on HDFS

$ bin/hdfs dfs -mkdir /tmp
$ bin/hdfs dfs -mkdir -p /user/hive/warehouse
$ bin/hdfs dfs -chmod g+w /tmp
$ bin/hdfs dfs -chmod g+w /user/hive/warehouse

3.2 Edit the configuration file

Create hive-env.sh from hive-env.sh.template and add the following:

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/modules/hadoop-2.5.0/

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/modules/hive-0.13.1/conf

3.3 Running the Hive command-line interface

(1). Enter the hive shell

// from the Hive install directory
$ bin/hive

(2). Create a table named student.

>use default;
>create table student(id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
>show tables;

After it runs:
[screenshot: creating a table]
(3). Load data

$ vi stu.txt
// contents (fields separated by tabs):
1001    sean
1002    jike
1003    tony
// load the data into the table
>load data local inpath '/opt/datas/stu.txt' into table student;
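Because the table was declared with FIELDS TERMINATED BY '\t', the columns in stu.txt must be separated by real tab characters; rows separated by spaces instead will load as NULL. A small sketch that generates the file with printf (a temporary file stands in for /opt/datas/stu.txt) and verifies the field count:

```shell
# A temp file stands in for /opt/datas/stu.txt so the check is self-contained.
STU=$(mktemp)
printf '%s\t%s\n' 1001 sean 1002 jike 1003 tony > "$STU"
# Every line must have exactly two tab-separated fields:
awk -F'\t' 'NF != 2 { bad = 1 } END { exit bad }' "$STU" && echo "format ok"
```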

(4). View the data in the table

>select * from student;

After it runs:
[screenshot: table contents]
(5). Query a single column

>select id from student;

After it runs:
[screenshot: id column]

3.4 Running HiveServer2 and Beeline

// from the Hive install directory
$ bin/hiveserver2
$ bin/beeline -u jdbc:hive2://$HS2_HOST:$HS2_PORT
// HiveServer2 listens on port 10000 by default, e.g. jdbc:hive2://localhost:10000

4. Installing MySQL

4.1 Install over the network with yum

Make sure the machine obtains an IP address automatically (i.e. has network access).

4.2 Replace the system repository source

    $ cd /etc/yum.repos.d
    $ sudo mv CentOS-Base.repo CentOS-Base.repo.bak 
    $ sudo touch CentOS6-Base-163.repo
    $ sudo vi CentOS6-Base-163.repo
    // after filling in the mirror configuration, run
    $ sudo yum clean all

4.3 Install MySQL

    $ sudo yum list | grep mysql
    $ sudo yum install mysql-server -y

4.4 Start MySQL

    $ sudo service mysqld status
    $ sudo service mysqld start

4.5 Set the root password

    # /usr/bin/mysqladmin -u root password '123456'

4.6 Enable mysqld at boot

$ sudo chkconfig mysqld on
$ sudo chkconfig --list | grep mysqld

After it runs:
[screenshot: mysqld enabled at boot]

4.7 Test the login

    $ mysql -uroot -p123456    
    > show databases ;
    > use test ;
    > show tables ;

5. Common Hive configuration

5.1 Configure hive-site.xml

First create hive-site.xml in $HIVE_HOME/conf (e.g. with vi) and add the following:

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop-miao.host.com/metastore_db?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password to use against metastore database</description>
    </property>
</configuration>

5.2 Add the MySQL JDBC driver

$tar -zxf  mysql-connector-java-5.1.27.tar.gz
$cd mysql-connector-java-5.1.27
$cp mysql-connector-java-5.1.27-bin.jar /opt/modules/hive-0.13.1/lib/

5.3 Configure user connection privileges

$mysql -uroot -p123456

>use mysql;

>select User,Host,Password from user;

>update user set Host='%' where User = 'root' and Host='localhost';

>delete from user where user='root' and host='127.0.0.1';

>delete from user where user='root' and host='miaodonghua.host';

>delete from user where host='localhost';

>delete from user where host='hadoop-miao.host.com';

>flush privileges;

//check the result
>select User,Host,Password from user;

After it runs:
[screenshot: updated connection privileges]
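Instead of editing the user table row by row as above, the same effect can be had with a single GRANT statement. A sketch assuming the root/123456 credentials used in this article:

```sql
-- Allow root to connect from any host with the password set earlier
-- (syntax for MySQL 5.x, the version installed above):
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;
FLUSH PRIVILEGES;
```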

5.4 Configure the Hive warehouse location

The default is /user/hive/warehouse; it can be changed as follows:

    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hadoop/warehouse</value>
        <description>location of default database for the warehouse</description>
    </property>
    Note the permissions on the directory you choose:
        $bin/hdfs dfs -mkdir -p /user/hive/warehouse
        $bin/hdfs dfs -chmod g+w /user/hive/warehouse

5.5 Hive執行日誌資訊位置

開啟hive-log4j.properties,由模板hive-log4j.properties.template修改
$vi $HIVE_HOME/conf/log4j.properties
//系統/tmp目錄
hive.log.dir=${java.io.tmpdir}/${user.name}
hive.log.file=hive.log
可修改為:
hive.log.dir=/opt/modules/hive-0.13.1/logs
hive.log.file=hive.log
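The edit can also be scripted. A sketch using sed on a temporary copy of the file, so nothing real is touched (the target path /opt/modules/hive-0.13.1/logs is the one chosen above):

```shell
# Work on a temp copy for this demo; in practice the target is
# $HIVE_HOME/conf/hive-log4j.properties.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
hive.log.dir=${java.io.tmpdir}/${user.name}
hive.log.file=hive.log
EOF
sed -i 's|^hive.log.dir=.*|hive.log.dir=/opt/modules/hive-0.13.1/logs|' "$CONF"
grep '^hive.log.dir=' "$CONF"
```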

5.6 Set the log level Hive prints at runtime

    $HIVE_HOME/conf/hive-log4j.properties
    The default is hive.root.logger=INFO,DRFA
    It can be changed to hive.root.logger=DEBUG,DRFA
    DEBUG is useful when troubleshooting

5.7 Show the current database and query column headers in the CLI

    $HIVE_HOME/conf/hive-site.xml
    // add the following
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>

5.8 Set configuration properties when starting Hive

$ bin/hive --hiveconf <property=value>
e.g.:
$ bin/hive --hiveconf hive.cli.print.current.db=false
Note: properties set this way apply only to the current session.

5.9 Hive configuration precedence:

--hiveconf  >  hive-site.xml   >   hive-default.xml

5.10 View all current configuration settings

    > set ;
    > set hive.cli.print.header ;  ## get a property's value
    > set hive.cli.print.header = false ;  ## set a property's value

There are four ways to set a property, ordered by precedence:

    set  > --hiveconf  >  hive-site.xml   >   hive-default.xml

6. Common Hive shell operations

6.1 Common SQL-like statements

(1). Create a table and query it

>show databases ;
>create database db_hive ;  
>use db_hive ;
>create table student(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ; 
>load data inpath '/user/hive/warehouse/student/stu.txt' into table student ;
>select * from student ;
>select id from student ;
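Note the difference from section 3.3: without LOCAL, the path refers to HDFS, and the file is moved (not copied) into the table's warehouse directory. A sketch contrasting the two (the paths are illustrative):

```sql
-- Copies the file from the client's local filesystem:
load data local inpath '/opt/datas/stu.txt' into table student;
-- Moves the file from its HDFS location into the table directory:
load data inpath '/user/hive/warehouse/student/stu.txt' into table student;
```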

After the statements above succeed, SecureCRT shows:
[screenshot: table info in hive]
The jobs can also be viewed in a browser at miaodonghua1.host:8088:
[screenshot: port 8088 view]

(2). View a table's description

>desc student ;
>desc extended student ;
>desc formatted student ;

After it runs:
[screenshot: desc output]

(3). Using built-in functions

>show functions ;
>desc function upper ;
>desc function extended upper ;
>select name, upper(name) upper_name from student ;

After it runs:
[screenshot: upper() output]

6.2 Inspecting the Hive metadata in MySQL

The Hive metadata tables are organized as follows:
[diagram: hive metastore schema]

>use metastore;
>select * from TBLS;
>select * from COLUMNS_V2;
>select * from DBS;
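TBLS references DBS through DB_ID, so the metadata tables can be joined. A sketch listing each Hive table with its database (column names taken from the Hive 0.13 metastore schema):

```sql
-- One row per Hive table, with the database it belongs to:
SELECT d.NAME AS db_name, t.TBL_NAME, t.TBL_TYPE
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID;
```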

6.3 Common Hive CLI options

(1). View the command help

$ bin/hive -help
usage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

(2). Open a specific database

$bin/hive --database db_hive

[screenshot: opening a database]

(3). Query a table in a given database directly

$bin/hive -e "select * from db_hive.student ;"

[screenshot: querying the table]

(4). Run a Hive script

$ bin/hive -f stu.sql
--contents of stu.sql:
use db_hive ;
select * from student ;

[screenshot: running the SQL script]

(5). Redirect the output to a file

$bin/hive -f stu.sql  > /opt/datas/hivef-res.txt

[screenshot: output redirected to a file]

6.4 Common interactive commands

>quit/exit ;    //leave the shell
>set key=value ;//set a variable
                //note: a misspelled variable name does not raise an error
>set/set -v ;   //show variables changed this session / all Hadoop and Hive variables
>! <command> ;  //run a local shell command from the Hive shell
>dfs <dfs command> ;//run an HDFS dfs command from the Hive shell
>query string ; //run a Hive query and print the results to standard output
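For example, a short session mixing these commands (the output depends on your cluster):

```sql
hive> !date;                          -- run a local shell command
hive> dfs -ls /user/hive/warehouse ;  -- list HDFS contents without leaving the shell
hive> select count(*) from student ;  -- query results go to standard output
```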

Troubleshooting

<property>
    <name>hive.metastore.uris</name>
    <value>thrift://miaodonghua1.host:9083</value>
</property>

If hive.metastore.uris is configured but the metastore service has not been started (start it with $ bin/hive --service metastore &), the following error appears:

Logging initialized using configuration in file:/opt/cdh2.3.6/hive-0.13.1-cdh5.3.6/conf/hive-log4j.properties
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:371)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1426)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:63)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:73)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2625)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2644)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:365)
        ... 7 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1424)
        ... 12 more
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
        at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:351)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:219)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1424)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:63)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:73)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2625)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2644)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:365)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
        ... 19 more
)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:398)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:219)
        ... 17 more