Hive (Part 1): Installing and Configuring Hive
Installing and configuring Hive
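Download the Hive package
This walkthrough uses hive-0.13.1-cdh5.3.6; the archive is named hive-0.13.1-cdh5.3.6.tar.gz.
Extract Hive into the installation directory
$ tar -zxvf hive-0.13.1-cdh5.3.6.tar.gz -C /usr/local/src/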
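The extraction step can be rehearsed without the real tarball. The sketch below builds a small stand-in archive in a temporary directory and unpacks it with -C, which is how the Hive archive could be unpacked straight into an install directory. The extract_to helper and the hive-demo archive are illustrative assumptions, not part of the original steps.

```shell
#!/bin/sh
# extract_to ARCHIVE DEST: unpack a .tar.gz into DEST (hypothetical helper).
extract_to() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Build a stand-in archive so the sketch runs without the real Hive tarball.
work=$(mktemp -d)
mkdir -p "$work/src/hive-demo/conf"
echo "placeholder" > "$work/src/hive-demo/conf/hive-env.sh.template"
tar -czf "$work/hive-demo.tar.gz" -C "$work/src" hive-demo

# Unpack it into a separate directory and list the result.
extract_to "$work/hive-demo.tar.gz" "$work/install"
ls "$work/install/hive-demo/conf"
```

On the real machine the call would look like: extract_to hive-0.13.1-cdh5.3.6.tar.gz /usr/local/src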
Rename the configuration templates (run these in Hive's conf directory)
mv hive-default.xml.template hive-site.xml
mv hive-env.sh.template hive-env.sh
mv hive-log4j.properties.template hive-log4j.properties
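A possible variation is to copy the templates instead of moving them, so the originals stay available for reference. The sketch below assumes a hypothetical helper named activate_templates and runs against a scratch directory with empty stand-in files rather than a real Hive conf directory.

```shell
#!/bin/sh
# activate_templates CONF_DIR: copy each template to its live name unless
# the live file already exists (hypothetical helper, not part of Hive).
activate_templates() {
  conf_dir=$1
  for pair in \
    "hive-default.xml.template:hive-site.xml" \
    "hive-env.sh.template:hive-env.sh" \
    "hive-log4j.properties.template:hive-log4j.properties"
  do
    src=${pair%%:*}   # text before the colon: template name
    dst=${pair##*:}   # text after the colon: live file name
    if [ -f "$conf_dir/$src" ] && [ ! -e "$conf_dir/$dst" ]; then
      cp "$conf_dir/$src" "$conf_dir/$dst"
    fi
  done
}

# Demo against a scratch directory with empty stand-in templates.
conf=$(mktemp -d)
touch "$conf/hive-default.xml.template" "$conf/hive-env.sh.template" \
      "$conf/hive-log4j.properties.template"
activate_templates "$conf"
ls "$conf"
```

Because existing files are never overwritten, running the helper a second time leaves any edits to the live files intact.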
The hive-env.sh file
export JAVA_HOME=/usr/local/src/jdk1.8.0_121
export HADOOP_HOME=/usr/local/src/hadoop-2.5.0-cdh5.3.6
export HIVE_CONF_DIR=/usr/local/src/hive-0.13.1-cdh5.3.6/conf
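A mistyped path in hive-env.sh tends to surface only when Hive starts, so it may help to check the directories first. This is a minimal sketch; check_dirs is a hypothetical helper, and the demo uses temporary paths instead of the real JAVA_HOME, HADOOP_HOME and HIVE_CONF_DIR.

```shell
#!/bin/sh
# check_dirs DIR...: print every path that is not an existing directory and
# return non-zero if at least one is missing (hypothetical helper).
check_dirs() {
  missing=0
  for d in "$@"; do
    if [ ! -d "$d" ]; then
      echo "missing: $d"
      missing=1
    fi
  done
  return "$missing"
}

# Demo: one directory that exists and one that does not.
ok_dir=$(mktemp -d)
report=$(check_dirs "$ok_dir" "$ok_dir/does-not-exist") && status=0 || status=$?
echo "$report"
echo "exit status: $status"
```

After sourcing hive-env.sh, the same helper could be called as: check_dirs "$JAVA_HOME" "$HADOOP_HOME" "$HIVE_CONF_DIR"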
The hive-site.xml file: set the following properties
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
<description> JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<!-- Show the database name and the column names -->
<!-- Whether the CLI prints column names in query results -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
<description>Whether to print the names of the columns in query output.</description>
</property>
<!-- Whether the CLI prompt shows the name of the current database -->
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
<!-- Let simple HiveQL queries bypass MapReduce -->
<property>
<name>hive.fetch.task.conversion</name>
<value>more</value>
<description>
Some select queries can be converted to single FETCH task minimizing latency.
Currently the query should be single sourced not having any subquery and should not have
any aggregations or distincts (which incurs RS), lateral views and joins.
1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
2. more : SELECT, FILTER, LIMIT only (TABLESAMPLE, virtual columns)
</description>
</property>
Note: in this version, hive-site.xml is missing a <property> opening tag near line 2787; add it back so the file stays well-formed XML.
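One way to locate a missing opening tag like this is to compare tag counts and then find the first closing tag that has nothing to close. The sketch below is only a line-based check, not a real XML parser: it assumes at most one property tag per line, and both helper names are hypothetical. The sample file stands in for the real hive-site.xml.

```shell
#!/bin/sh
# tag_balance FILE: print "<open count> <close count>" for property tags.
# grep -c counts matching lines and exits non-zero on a zero count,
# hence the "|| true".
tag_balance() {
  opens=$(grep -c '<property>' "$1") || true
  closes=$(grep -c '</property>' "$1") || true
  echo "$opens $closes"
}

# first_unopened_close FILE: print the line number of the first closing tag
# that has no unmatched opening tag before it.
first_unopened_close() {
  awk '/<property>/   { depth++ }
       /<\/property>/ { if (depth == 0) { print NR; exit } depth-- }' "$1"
}

# Stand-in file: the second property block lacks its opening tag.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
EOF

result=$(tag_balance "$sample")
bad_line=$(first_unopened_close "$sample")
echo "open/close counts: $result"
echo "first unmatched closing tag at line: $bad_line"
```

Pointing both helpers at the real hive-site.xml should report a line close to the one mentioned in the note.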
hive-log4j.properties
hive.log.dir=/usr/local/src/hive-0.13.1-cdh5.3.6/logs
Install MySQL
su - root
yum -y install mysql mysql-server mysql-devel
wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum -y install mysql-community-server
Configure MySQL
Start the MySQL service
systemctl start mysqld.service
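Set the root user's password
mysqladmin -uroot password '123456'
Grant privileges to the user and to other machine nodes
mysql> grant all on *.* to root@'master' identified by '123456';
grant: grants privileges
all: all privileges
*.*: database name.table name (here, every table in every database)
root: the MySQL user receiving the privileges
@'master': the host name the user connects from
Password: 123456
When finished, reload the privileges: flush privileges;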
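When several nodes need access, typing the grant by hand for each host gets error-prone. The sketch below only prints the statements; grant_sql is a hypothetical helper, and the syntax assumes a MySQL 5.x server, since grant ... identified by was removed in MySQL 8.0.

```shell
#!/bin/sh
# grant_sql USER HOST PASSWORD: print the grant statement followed by
# "flush privileges;" (hypothetical helper; MySQL 5.x grant syntax).
grant_sql() {
  printf "grant all on *.* to '%s'@'%s' identified by '%s';\n" "$1" "$2" "$3"
  printf 'flush privileges;\n'
}

# Print the statements for the host and credentials used in this article.
sql=$(grant_sql root master 123456)
echo "$sql"
```

The output could then be piped into the client, for example: grant_sql root master 123456 | mysql -uroot -p. Passing '%' as the host would match connections from any machine, which is convenient on a test cluster but broader than most setups need.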
Copy the database driver jar into the lib folder under the Hive root directory
cp -a mysql-connector-java-5.1.27-bin.jar /usr/local/src/hive-0.13.1-cdh5.3.6/lib/
Start Hive
bin/hive
Adjust the permissions of the Hive-related directories in HDFS
/usr/local/src/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -chmod 777 /tmp/
/usr/local/src/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -chmod 777 /user/hive/warehouse
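On a fresh cluster the directories may not exist yet, in which case chmod fails. The sketch below prints a mkdir -p plus chmod pair for each directory instead of running them, so it is safe on a machine without a running HDFS. It assumes HADOOP_HOME has the value set in hive-env.sh, and hdfs_prep_commands is a hypothetical helper.

```shell
#!/bin/sh
# Assumption: HADOOP_HOME matches the value configured in hive-env.sh.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/src/hadoop-2.5.0-cdh5.3.6}

# hdfs_prep_commands: print, rather than run, the commands that create the
# directories Hive uses and open up their permissions.
hdfs_prep_commands() {
  for dir in /tmp /user/hive/warehouse; do
    echo "$HADOOP_HOME/bin/hadoop fs -mkdir -p $dir"
    echo "$HADOOP_HOME/bin/hadoop fs -chmod 777 $dir"
  done
}

plan=$(hdfs_prep_commands)
echo "$plan"
```

Once the printed commands look right, they could be executed with: hdfs_prep_commands | sh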
Create a database
create database school;
Create a table
create table t1(eid int, name string, sex string) row format delimited fields terminated by '\t';
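Since the table is declared with tab-delimited fields, any data file loaded into it needs exactly three tab-separated columns per row. The sketch below writes a small sample file and checks the column count; the rows are made-up illustration data.

```shell
#!/bin/sh
# Write three sample rows for t1(eid int, name string, sex string).
# The names and values are made-up illustration data.
data=$(mktemp)
printf '%s\t%s\t%s\n' 1 alice female >  "$data"
printf '%s\t%s\t%s\n' 2 bob   male   >> "$data"
printf '%s\t%s\t%s\n' 3 carol female >> "$data"

# Count rows whose tab-separated field count is not exactly 3.
bad_rows=$(awk -F '\t' 'NF != 3 { n++ } END { print n + 0 }' "$data")
echo "rows with a wrong field count: $bad_rows"
```

Running the same awk check on a real export before loading it can catch rows that were separated with spaces instead of tabs.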
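Import data into a Hive table
Import from the local file system:
load data local inpath '/path/to/file' into table db_name.table_name; (this step also uploads the file to HDFS)
Import from HDFS:
load data inpath '/hdfs/path/to/file' into table db_name.table_name;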
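The create and load statements can also be collected into a script file and run in one go. The sketch below writes such a file; the file paths inside it are hypothetical, and the comments note the main difference between the two load forms: with local the file is copied from the client machine, without it the path is an HDFS path and the file is moved into the table's directory.

```shell
#!/bin/sh
# Collect the statements into a script file; the data paths are hypothetical.
script=$(mktemp)
cat > "$script" <<'EOF'
create database if not exists school;
use school;
create table if not exists t1(eid int, name string, sex string)
  row format delimited fields terminated by '\t';
-- "local": the file is read from the client machine and copied into HDFS.
load data local inpath '/tmp/t1.tsv' into table t1;
-- no "local": the path is an HDFS path and the file is moved, not copied.
load data inpath '/user/hive/staging/t1.tsv' into table t1;
EOF

# Count the load statements written to the script.
loads=$(grep -c '^load data' "$script")
echo "script $script contains $loads load statements"
```

The script could then be executed with: hive -f "$script"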
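Where Hive stores its command history
cat ~/.hivehistory
This is mainly useful for tracking down logic errors or looking up frequently used commands.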
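Once the history file grows, searching it is usually quicker than reading it in full. This is a small sketch with a hypothetical helper, hive_history_grep; the demo uses a stand-in history file so it runs without Hive.

```shell
#!/bin/sh
# hive_history_grep PATTERN [FILE]: list matching history lines with their
# line numbers, case-insensitively. FILE defaults to ~/.hivehistory.
hive_history_grep() {
  grep -in -- "$1" "${2:-$HOME/.hivehistory}"
}

# Demo against a stand-in history file.
hist=$(mktemp)
printf '%s\n' 'show databases;' 'create database school;' 'use school;' > "$hist"
matches=$(hive_history_grep 'school' "$hist")
echo "$matches"
```

Calling hive_history_grep 'load data' with no second argument would search the real history file.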
Temporary (session-only) settings in Hive
Syntax: set property_name=property_value;
Example: set hive.cli.print.header=false;
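Because a set statement only lasts for the current session, overrides that are needed often can be kept in a small init file instead of being retyped. The sketch below writes such a file; the file location is arbitrary and the two properties are simply the ones already used in this article.

```shell
#!/bin/sh
# Write session-level overrides to an init file (the file name is arbitrary).
init=$(mktemp)
cat > "$init" <<'EOF'
set hive.cli.print.header=false;
set hive.cli.print.current.db=true;
EOF

# Count the set statements written above.
overrides=$(grep -c '^set ' "$init")
echo "wrote $overrides overrides to $init"
```

Starting the CLI with hive -i "$init" runs the file before the prompt appears. For a single property, hive --hiveconf hive.cli.print.header=false has a similar effect. Neither changes hive-site.xml.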