Installing Hive on Linux: three setups
This guide uses apache-hive-1.2.1-bin.tar.gz as the example.
Server node5: 192.168.13.135
Server node6: 192.168.13.136
Server node7: 192.168.13.137
Server node8: 192.168.13.138
I. Embedded (local) Derby mode
1. Upload the Hive tarball to the server (/opt/sxt/soft)
2. Extract it: tar -zxvf apache-hive-1.2.1-bin.tar.gz
3. Edit the configuration (/opt/sxt/soft/apache-hive-1.2.1-bin/conf)
1) cp hive-default.xml.template hive-site.xml
2) vi hive-site.xml
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
4. Set the environment variables (then reload the profile)
vi ~/.bash_profile
export HIVE_HOME=/opt/sxt/soft/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
source ~/.bash_profile
5. Start Hive
1) HDFS and YARN must be running first (start-all.sh)
Verify in a browser: node5/8:50070 (HDFS) and node5/8:8088 (YARN)
2) Replace the jline-*.jar under HADOOP_HOME with the jline-2.12.jar from HIVE_HOME/lib,
to make sure Hadoop and Hive use the same jline version.
The jar's path under Hadoop is: share/hadoop/yarn/lib
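The jar swap in step 2) can be sketched as a small shell helper; the function name and the /tmp backup directory are illustrative, and the directory layout follows this guide:

```shell
# Sketch: move Hadoop's old jline jar(s) aside and copy in Hive's jline-2.12.jar,
# so that Hadoop and Hive resolve the same jline version.
swap_jline() {
  local hadoop_home="$1" hive_home="$2"
  local yarn_lib="$hadoop_home/share/hadoop/yarn/lib"
  mkdir -p /tmp/jline-backup
  mv "$yarn_lib"/jline-*.jar /tmp/jline-backup/   # back up the old jar(s)
  cp "$hive_home/lib/jline-2.12.jar" "$yarn_lib/" # install Hive's version
}
# Example (the hadoop path below is hypothetical -- use your own install dir):
# swap_jline /opt/sxt/soft/hadoop-2.6.5 /opt/sxt/soft/apache-hive-1.2.1-bin
```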
3) Start it: hive
Note: with Derby storage, running hive creates a derby.log file and a metastore_db directory in the current working directory. The drawback of this mode is that only one Hive client at a time can use the database from any given directory.
II. Local MySQL mode (recommended; "local" here means a MySQL server at any reachable IP address)
1. Edit the configuration file
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml
Delete the Derby settings and replace them with:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive_remote/warehouse</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>1234</value>
</property>
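Hand-editing this file can easily leave it malformed. A hypothetical helper (not part of Hive; assumes python3 is installed) to check that hive-site.xml is still well-formed XML before starting Hive:

```shell
# check_xml: parse a file as XML; print a confirmation on success,
# otherwise the parse error is reported and the function returns nonzero.
check_xml() {
  python3 -c 'import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1])' "$1" \
    && echo "$1: well-formed"
}
# e.g. check_xml /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml
```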
2. Copy the MySQL JDBC driver jar
Copy the MySQL connector jar (here mysql-connector-java-5.1.32-bin.jar) into $HIVE_HOME/lib
3. Start MySQL
service mysqld start
4. Start Hive (make sure the database is up)
hive
III. Remote MySQL mode
1) Remote, combined (not personally tested)
This mode requires a MySQL server running on a remote machine, plus a metastore service started on the Hive server. Here a MySQL test server
at 192.168.13.138 is used; create a hive_remote database on it with charset latin1.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.13.135:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>1234</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.13.138:9083</value>
</property>
</configuration>
Note: here the Hive server and client are placed on the same machine; they can also be split apart.
2) Remote, split
1. Install Hive on node8 (or any other node) as described above
2. Edit hive-site.xml on node8 (the server-side configuration)
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml, with the contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.13.135:3306/hive?createDatabaseIfNotExist=true</value>
<!-- 192.168.13.135:3306 is the server running MySQL; it need not be node5 -->
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>1234</value>
</property>
</configuration>
3. Start the Hive server process (on node8): hive --service metastore
4. Edit hive-site.xml on node5 (the client-side configuration)
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml, with the contents:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.13.138:9083</value>
</property>
</configuration>
5. On the client (node5), simply run the hive command: hive
hive> show tables;
OK
Time taken: 0.707 seconds
hive>
6. Start HiveServer2
$HIVE_HOME/bin/hiveserver2, or $HIVE_HOME/bin/hive --service hiveserver2
7. Connect to HiveServer2
1) Via beeline
beeline
beeline> !connect jdbc:hive2://localhost:10000 root org.apache.hive.jdbc.HiveDriver
or
beeline> !connect jdbc:hive2://localhost:10000/default
2) Via a JDBC client
// Requires the Hive JDBC driver on the classpath
// (e.g. hive-jdbc-1.2.1-standalone.jar from $HIVE_HOME/lib, plus hadoop-common).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TestHive2 {
    public static void main(String[] args) {
        try {
            Class.forName("org.apache.hive.jdbc.HiveDriver"); // load the HiveServer2 driver
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://192.168.13.135:10000/default", "root", "");
            Statement stmt = conn.createStatement();
            ResultSet resultSet = stmt.executeQuery("select count(*) from people");
            if (resultSet.next()) {
                System.out.println(resultSet.getInt(1));
            }
            conn.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Appendix:
1) Load local data into a table
load data local inpath '/opt/sxt/temp/test.txt' into table people PARTITION (dt='2016-1-1');
(use plain ASCII quotes; curly quotes copied straight out of a document will cause an error)
2) Example DDL: creating a few tables
1.CREATE TABLE page_view(
page_url STRING,
ip STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
Sample data (--> denotes a tab character, i.e. '\t'):
Node1-->192.168.13.1
Node2-->192.168.13.2
2.CREATE TABLE people(
id STRING,
name STRING,
likes Array<String>,
addr Map<String,String>
)
PARTITIONED BY(dt STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
STORED AS TEXTFILE;
Sample data:
1-->zs-->game,girl,money-->stuAddr:nantong,workAddr:tongzhou-->2017-1-1
2-->ls-->game,girl,money-->stuAddr:nantong,workAddr:tongzhou-->2017-1-1
select addr['stuAddr'] from people where name='zs';   -- returns nantong
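The sample rows above can be produced as a load-ready file with printf (the /tmp/people.txt path is just an example). Note that the partition column dt is supplied by the LOAD ... PARTITION clause, not stored in the file:

```shell
# Two tab-delimited rows matching the people table layout: fields separated
# by \t, array items by ',', map entries written as key:value.
printf '1\tzs\tgame,girl,money\tstuAddr:nantong,workAddr:tongzhou\n'  > /tmp/people.txt
printf '2\tls\tgame,girl,money\tstuAddr:nantong,workAddr:tongzhou\n' >> /tmp/people.txt
```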
3. Copying query results from one table into another
Managed (internal) vs. external tables: dropping a managed table destroys its data; dropping an external table leaves the data on HDFS.
Managed table data is owned by Hive; external table data lives elsewhere (an external table is usually created with a LOCATION clause pointing at existing HDFS data).
CREATE [EXTERNAL] TABLE people_test(
id STRING,
name STRING,
likes Array<String>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE;
INSERT OVERWRITE TABLE people_test select id,name,likes FROM people where name='zs';
-- empty the table (overwrite with an empty result set)
INSERT OVERWRITE TABLE people_test select id,name,likes FROM people where 1=2;
UPDATE people_test SET name = 'ls' where name='zs';
(note: UPDATE only works on transactional/ACID tables; it fails on a plain TEXTFILE table like this one)
-- drop a partition
ALTER TABLE people DROP IF EXISTS PARTITION (dt='2016-1-1');