
Hive Java API Operations

To access Hive from Java you connect through HiveServer2, the same service the beeline client talks to. HiveServer2 is an upgraded replacement for the older HiveServer: it is more capable and adds access control, and it ships with a new command-line tool called beeline. To use beeline, first start hiveserver2 and then connect with beeline.

1. Using beeline

Start hiveserver2:

$ hiveserver2

Connect to Hive with beeline:

$ beeline -u jdbc:hive2://hdpcomprs:10000/db_comprs -n hadoop -p


Parameter description:
-u: the connection URL; an IP address or a hostname can be used, and the port defaults to 10000
-n: the username to connect as (note: this is not a Hive login, but an OS login user on the server where Hive runs)
-p: the password; it can be left empty

You can change the port with the following command:

hiveserver2 --hiveconf hive.server2.thrift.port=14000
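
If you want the change to be permanent, the same property, hive.server2.thrift.port, can also be set in hive-site.xml instead of being passed on the command line.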

If you are not sure how to use beeline, the following command prints its help:

$ beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine 
   -u <database url>               the JDBC URL to connect to
   -r                              reconnect to last saved connect url
                                   (in conjunction with !save)
   -n <username>                   the username to connect as
   -p <password>                   the password to connect as
   -d <driver class>               the driver class to use
   -i <init file>                  script file for initialization
   -e <query>                      query that should be executed
   -f <exec file>                  script file that should be executed
   -w (or) --password-file <password file>  the password file to read password from
   --hiveconf property=value       Use value for given property
   --hivevar name=value            hive variable name and value
                                   This is Hive specific settings in which variables
                                   can be set at session level and referenced in Hive
                                   commands or queries.
   --property-file=<property-file> the file to read connection properties (url, driver, user, password) from
   --color=[true/false]            control whether color is used for display
   --showHeader=[true/false]       show column names in query results
   --headerInterval=ROWS;          the interval between which heades are displayed
   --fastConnect=[true/false]      skip building table/column list for tab-completion
   --autoCommit=[true/false]       enable/disable automatic transaction commit
   --verbose=[true/false]          show verbose error messages and debug info
   --showWarnings=[true/false]     display connection warnings
   --showDbInPrompt=[true/false]   display the current database name in the prompt
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]        format numbers using DecimalFormat pattern
   --force=[true/false]            continue running script even after errors
   --maxWidth=MAXWIDTH             the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTH    the maximum width to use when displaying columns
   --silent=[true/false]           be more silent
   --autosave=[true/false]         automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                   Note that csv, and tsv are deprecated - use csv2, tsv2 instead
   --incremental=[true/false]      Defaults to false. When set to false, the entire result set
                                   is fetched and buffered before being displayed, yielding optimal
                                   display column sizing. When set to true, result rows are displayed
                                   immediately as they are fetched, yielding lower latency and
                                   memory usage at the price of extra display column padding.
                                   Setting --incremental=true is recommended if you encounter an
                                   OutOfMemory on the client side (due to the fetched result set
                                   size being large). Only applicable if --outputformat=table.
   --incrementalBufferRows=NUMROWS the number of rows to buffer when printing rows on stdout,
                                   defaults to 1000; only applicable if --incremental=true
                                   and --outputformat=table
   --truncateTable=[true/false]    truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER     specify the delimiter for delimiter-separated values output format (default: |)
   --isolation=LEVEL               set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of printing null as empty string
   --maxHistoryRows=MAXHISTORYROWS The maximum number of rows to store beeline history.
   --help                          display this message

   Example:
    1. Connect using simple authentication to HiveServer2 on localhost:10000
       $ beeline -u jdbc:hive2://localhost:10000 username password
    2. Connect using simple authentication to HiveServer2 on hs.local:10000 using -n for username and -p for password
       $ beeline -n username -p password -u jdbc:hive2://hs2.local:10012
    3. Connect using Kerberos authentication with hive/[email protected] as HiveServer2 principal
       $ beeline -u "jdbc:hive2://hs2.local:10013/default;principal=hive/[email protected]"
    4. Connect using SSL connection to HiveServer2 on localhost at 10000
       $ beeline "jdbc:hive2://localhost:10000/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword"
    5. Connect using LDAP authentication
       $ beeline -u jdbc:hive2://hs2.local:10013/default <ldap-username> <ldap-password>

If beeline reports the following error when connecting:
hadoop is not allowed to impersonate hadoop (state=08S01,code=0)

Cause: HiveServer2 enforces impersonation (proxy user) checks, so the connecting user has to be authorized in Hadoop's configuration.
Solution: add the following to Hadoop's core-site.xml (the "hadoop" in the property names is the user that runs HiveServer2; replace it with your own user if it differs), restart Hadoop, and then connect with beeline again.

<property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
</property>

Once connected, you can run HiveQL statements just as you would after starting the hive CLI. To close the connection and exit, use the !q or !quit command.

2. Operating Hive through the Java API

Create a Maven project and configure its pom.xml as follows:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bigdata.hadoop</groupId>
    <artifactId>hive</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>2.3.0</version>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.9</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
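
Next, create the test class below. Each Hive operation (creating databases and tables, loading data, querying, and dropping objects) is written as a separate JUnit test method, so the operations can be run one at a time.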
package com.bigdata.hadoop.hive;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.sql.*;

/**
 * JDBC operations on Hive (note: HiveServer2 must be running before Hive can be accessed over JDBC)
 */
public class HiveJDBC {

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    private static String url = "jdbc:hive2://hdpcomprs:10000/db_comprs";
    private static String user = "hadoop";
    private static String password = "";

    private static Connection conn = null;
    private static Statement stmt = null;
    private static ResultSet rs = null;

    // Load the driver and open the connection
    @Before
    public void init() throws Exception {
        Class.forName(driverName);
        conn = DriverManager.getConnection(url,user,password);
        stmt = conn.createStatement();
    }

    // Create a database
    @Test
    public void createDatabase() throws Exception {
        String sql = "create database hive_jdbc_test";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // List all databases
    @Test
    public void showDatabases() throws Exception {
        String sql = "show databases";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

    // Create a table
    @Test
    public void createTable() throws Exception {
        String sql = "create table emp(\n" +
                        "empno int,\n" +
                        "ename string,\n" +
                        "job string,\n" +
                        "mgr int,\n" +
                        "hiredate string,\n" +
                        "sal double,\n" +
                        "comm double,\n" +
                        "deptno int\n" +
                        ")\n" +
                     "row format delimited fields terminated by '\\t'";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // List all tables
    @Test
    public void showTables() throws Exception {
        String sql = "show tables";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

    // Describe the table structure
    @Test
    public void descTable() throws Exception {
        String sql = "desc emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getString(2));
        }
    }

    // Load data (note: 'local inpath' is resolved on the HiveServer2 host, not on the JDBC client machine)
    @Test
    public void loadData() throws Exception {
        String filePath = "/home/hadoop/data/emp.txt";
        String sql = "load data local inpath '" + filePath + "' overwrite into table emp";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Query data
    @Test
    public void selectData() throws Exception {
        String sql = "select * from emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        System.out.println("員工編號" + "\t" + "員工姓名" + "\t" + "工作崗位");
        while (rs.next()) {
            System.out.println(rs.getString("empno") + "\t\t" + rs.getString("ename") + "\t\t" + rs.getString("job"));
        }
    }

    // Aggregate query (this triggers a MapReduce job)
    @Test
    public void countData() throws Exception {
        String sql = "select count(1) from emp";
        System.out.println("Running: " + sql);
        rs = stmt.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getInt(1) );
        }
    }

    // Drop the database
    @Test
    public void dropDatabase() throws Exception {
        String sql = "drop database if exists hive_jdbc_test";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Drop the table
    @Test
    public void dropTable() throws Exception {
        String sql = "drop table if exists emp";
        System.out.println("Running: " + sql);
        stmt.execute(sql);
    }

    // Release resources
    @After
    public void destroy() throws Exception {
        if ( rs != null) {
            rs.close();
        }
        if (stmt != null) {
            stmt.close();
        }
        if (conn != null) {
            conn.close();
        }
    }
}
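
The JUnit structure above is convenient for trying each statement in isolation. As a complement, here is a minimal standalone sketch of the same JDBC flow (it simply reuses the URL, user, and database from the test class; adjust them to your environment). It lists the databases and uses try-with-resources in place of the @Before/@After lifecycle to close the ResultSet, Statement, and Connection:

package com.bigdata.hadoop.hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {

    public static void main(String[] args) throws Exception {
        // The Hive JDBC driver registers itself when the hive-jdbc jar is on the
        // classpath; loading it explicitly just fails fast if the jar is missing.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        String url = "jdbc:hive2://hdpcomprs:10000/db_comprs";

        // try-with-resources closes the ResultSet, Statement and Connection automatically
        try (Connection conn = DriverManager.getConnection(url, "hadoop", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("show databases")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}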