The Road to Learning Hive (6): Hive DDL Operations

阿新 • Published: 2018-04-05


Database Operations

1. Creating a Database

Syntax

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
    [COMMENT database_comment]          // description of the database
    [LOCATION hdfs_path]                // where the database is stored on HDFS
    [WITH DBPROPERTIES (property_name=property_value, ...)];    // database properties

Default location: /user/hive/warehouse/db_name.db/table_name/partition_name/…
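
The default location pattern above can be sketched as a small path builder. This is only an illustrative model in Python, not Hive code: the function name default_location and its arguments are hypothetical, and it assumes that each partition level is a directory named column=value and that tables in the built-in default database sit directly under the warehouse root.

```python
from posixpath import join

WAREHOUSE_ROOT = "/user/hive/warehouse"  # default root quoted in the text above


def default_location(db, table=None, partition=None):
    """Build the default HDFS directory for a database, table, or partition.

    `partition` is an ordered list of (column, value) pairs; each pair
    becomes one `column=value` directory level.
    """
    # Assumption: tables in the built-in `default` database live directly
    # under the warehouse root, without a `default.db` directory.
    path = WAREHOUSE_ROOT if db == "default" else join(WAREHOUSE_ROOT, db + ".db")
    if table is not None:
        path = join(path, table)
        for column, value in partition or []:
            path = join(path, f"{column}={value}")
    return path


print(default_location("t3"))
print(default_location("t2", "student"))
print(default_location("t2", "student_ptn_age", [("age", 19)]))
```

A LOCATION clause on the database or table overrides this layout, so the sketch only applies when no explicit path is given.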

Ways to create a database

(1) Create a plain database

0: jdbc:hive2://hadoop3:10000> create database t1;
No rows affected (0.308 seconds)
0: jdbc:hive2://hadoop3:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
| myhive         |
| t1             |
+----------------+
3 rows selected (0.393 seconds)
0: jdbc:hive2://hadoop3:10000>

(2) Create a database only if it does not already exist

0: jdbc:hive2://hadoop3:10000> create database if not exists t1;
No rows affected (0.176 seconds)
0: jdbc:hive2://hadoop3:10000>

(3) Create a database with a comment

0: jdbc:hive2://hadoop3:10000> create database if not exists t2 comment 'learning hive';
No rows affected (0.217 seconds)

0: jdbc:hive2://hadoop3:10000>

(4) Create a database with properties

0: jdbc:hive2://hadoop3:10000> create database if not exists t3 with dbproperties('creator'='hadoop','date'='2018-04-05');
No rows affected (0.255 seconds)
0: jdbc:hive2://hadoop3:10000>

2. Viewing Databases

Ways to inspect databases

(1) List all databases

0: jdbc:hive2://hadoop3:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
| myhive         |
| t1             |
| t2             |
| t3             |
+----------------+
5 rows selected (0.164 seconds)
0: jdbc:hive2://hadoop3:10000>

(2) Show detailed database properties

Syntax

desc database [extended] dbname;

Example

0: jdbc:hive2://hadoop3:10000> desc database extended t3;
+----------+----------+------------------------------------------+-------------+-------------+------------------------------------+
| db_name  | comment  |                 location                 | owner_name  | owner_type  |             parameters             |
+----------+----------+------------------------------------------+-------------+-------------+------------------------------------+
| t3       |          | hdfs://myha01/user/hive/warehouse/t3.db  | hadoop      | USER        | {date=2018-04-05, creator=hadoop}  |
+----------+----------+------------------------------------------+-------------+-------------+------------------------------------+
1 row selected (0.11 seconds)
0: jdbc:hive2://hadoop3:10000>

(3) Show the database currently in use

0: jdbc:hive2://hadoop3:10000> select current_database();
+----------+
|   _c0    |
+----------+
| default  |
+----------+
1 row selected (1.36 seconds)
0: jdbc:hive2://hadoop3:10000>

(4) Show the full CREATE statement for a database

0: jdbc:hive2://hadoop3:10000> show create database t3;
+----------------------------------------------+
|                createdb_stmt                 |
+----------------------------------------------+
| CREATE DATABASE `t3`                         |
| LOCATION                                     |
|   'hdfs://myha01/user/hive/warehouse/t3.db'  |
| WITH DBPROPERTIES (                          |
|   'creator'='hadoop',                        |
|   'date'='2018-04-05')                       |
+----------------------------------------------+
6 rows selected (0.155 seconds)
0: jdbc:hive2://hadoop3:10000>

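3. Dropping a Database

Notes

Drop statements

drop database dbname;
drop database if exists dbname;

By default, Hive does not allow dropping a database that still contains tables. There are two ways around this:

1. Manually drop every table in the database, then drop the database

2. Use the cascade keyword

drop database if exists dbname cascade;

The default behaviour is restrict, so the following two statements are equivalent:

drop database if exists myhive;
drop database if exists myhive restrict;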
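
The difference between restrict and cascade can be modelled with a toy in-memory catalog. This is a Python sketch for illustration only; the Metastore class and its method names are hypothetical and do not correspond to any Hive API.

```python
class Metastore:
    """Toy in-memory stand-in for the database/table catalog (hypothetical)."""

    def __init__(self):
        self.databases = {}  # database name -> set of table names

    def create_database(self, name):
        self.databases.setdefault(name, set())

    def create_table(self, db, table):
        self.databases[db].add(table)

    def drop_database(self, name, if_exists=False, cascade=False):
        if name not in self.databases:
            if if_exists:
                return  # IF EXISTS turns a missing database into a no-op
            raise KeyError(f"database {name} does not exist")
        if self.databases[name] and not cascade:
            # RESTRICT (the default): refuse while tables remain
            raise RuntimeError(f"database {name} is not empty")
        # Either the database is empty, or CASCADE drops its tables with it
        del self.databases[name]


catalog = Metastore()
catalog.create_database("t3")
catalog.create_table("t3", "student")
try:
    catalog.drop_database("t3")            # behaves like restrict
except RuntimeError as err:
    print("refused:", err)
catalog.drop_database("t3", cascade=True)  # removes the tables as well
print(catalog.databases)
```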

Example

(1) Drop a database that contains no tables

0: jdbc:hive2://hadoop3:10000> show tables in t1;
+-----------+
| tab_name  |
+-----------+
+-----------+
No rows selected (0.147 seconds)
0: jdbc:hive2://hadoop3:10000> drop database t1;
No rows affected (0.178 seconds)
0: jdbc:hive2://hadoop3:10000> show databases;
+----------------+
| database_name  |
+----------------+
| default        |
| myhive         |
| t2             |
| t3             |
+----------------+
4 rows selected (0.124 seconds)
0: jdbc:hive2://hadoop3:10000>

(2) Drop a database that contains tables

0: jdbc:hive2://hadoop3:10000> drop database if exists t3 cascade;
No rows affected (1.56 seconds)
0: jdbc:hive2://hadoop3:10000>

4. Switching Databases

Syntax

use database_name;

Example

0: jdbc:hive2://hadoop3:10000> use t2;
No rows affected (0.109 seconds)
0: jdbc:hive2://hadoop3:10000>

Table Operations

1. Creating a Table

Syntax

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
    [(col_name data_type [COMMENT col_comment], ...)]
    [COMMENT table_comment]
    [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
    [CLUSTERED BY (col_name, col_name, ...)
        [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
    [ROW FORMAT row_format]
    [STORED AS file_format]
    [LOCATION hdfs_path]

For details, see: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable

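•CREATE TABLE creates a table with the given name. If a table with the same name already exists, an exception is thrown; the IF NOT EXISTS option suppresses that exception
•The EXTERNAL keyword creates an external table; a path to the actual data (LOCATION) is given when the table is created
•LIKE copies the structure of an existing table without copying its data
•COMMENT adds a description to a table or a column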

•PARTITIONED BY specifies the partition columns
•ROW FORMAT 
    DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char] 
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] 
        | SERDE serde_name [WITH SERDEPROPERTIES 
        (property_name=property_value, property_name=property_value, ...)] 
    When creating a table you can supply a custom SerDe or use the built-in one. If ROW FORMAT is not specified, or ROW FORMAT DELIMITED is used, the built-in SerDe is used. You also declare the table's columns when creating it, alongside any custom SerDe; Hive uses the SerDe to map the stored data onto those columns.
•STORED AS 
    SEQUENCEFILE    // sequence file
    | TEXTFILE      // plain text file format
    | RCFILE        // hybrid row-columnar file
    | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname    // custom file format
    If the data is plain text, use STORED AS TEXTFILE. If the data needs to be compressed, use STORED AS SEQUENCEFILE.

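•LOCATION specifies the table's storage path on HDFS. If it is omitted, the table is stored under the default warehouse path according to the default rules.

Best practice:
    If the data already resides on HDFS and is shared by several users or clients, create an external table.
    Otherwise, create a managed (internal) table.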
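
To make the ROW FORMAT DELIMITED options described above more concrete, here is a rough Python model of how one text line could be split into columns using the three delimiters. It is a sketch, not the built-in SerDe: the function parse_delimited, the schema format, and the sample line are all hypothetical, and type conversion, escaping, and NULL handling are left out.

```python
def parse_delimited(line, schema, fields=",", items="-", map_keys=":"):
    """Split one text line into column values.

    `schema` is a list of (column_name, kind) pairs, where kind is
    "scalar", "array", or "map". The three delimiters play the roles of
    FIELDS, COLLECTION ITEMS, and MAP KEYS TERMINATED BY.
    """
    row = {}
    for (name, kind), raw in zip(schema, line.split(fields)):
        if kind == "array":
            row[name] = raw.split(items)
        elif kind == "map":
            # Each collection item is a key/value pair split on the map-key delimiter
            row[name] = dict(entry.split(map_keys, 1) for entry in raw.split(items))
        else:
            row[name] = raw
    return row


schema = [("id", "scalar"), ("name", "scalar"), ("hobbies", "array"), ("scores", "map")]
print(parse_delimited("1,alice,reading-chess,math:90-art:85", schema))
```

LINES TERMINATED BY corresponds to whatever splits the file into the individual line values passed to this function.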

Example

The following examples use the t2 database

(1) Create a default managed (internal) table

0: jdbc:hive2://hadoop3:10000> create table student(id int, name string, sex string, age int,department string) row format delimited fields terminated by ",";
No rows affected (0.222 seconds)
0: jdbc:hive2://hadoop3:10000> desc student;
+-------------+------------+----------+
|  col_name   | data_type  | comment  |
+-------------+------------+----------+
| id          | int        |          |
| name        | string     |          |
| sex         | string     |          |
| age         | int        |          |
| department  | string     |          |
+-------------+------------+----------+
5 rows selected (0.168 seconds)
0: jdbc:hive2://hadoop3:10000>

(2) External table

0: jdbc:hive2://hadoop3:10000> create external table student_ext
(id int, name string, sex string, age int,department string) row format delimited fields terminated by "," location "/hive/student";
No rows affected (0.248 seconds)
0: jdbc:hive2://hadoop3:10000>

(3) Partitioned table

0: jdbc:hive2://hadoop3:10000> create external table student_ptn(id int, name string, sex string, age int,department string)
. . . . . . . . . . . . . . .> partitioned by (city string)
. . . . . . . . . . . . . . .> row format delimited fields terminated by ","
. . . . . . . . . . . . . . .> location "/hive/student_ptn";
No rows affected (0.24 seconds)
0: jdbc:hive2://hadoop3:10000>

Add partitions

0: jdbc:hive2://hadoop3:10000> alter table student_ptn add partition(city="beijing");
No rows affected (0.269 seconds)
0: jdbc:hive2://hadoop3:10000> alter table student_ptn add partition(city="shenzhen");
No rows affected (0.236 seconds)
0: jdbc:hive2://hadoop3:10000>

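If a table is partitioned, each partition shows up as a subdirectory under the table's data directory.
In a partitioned table, data files must be stored inside a partition; they cannot be placed directly in the table directory.

(4) Bucketed table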

0: jdbc:hive2://hadoop3:10000> create external table student_bck(id int, name string, sex string, age int,department string)
. . . . . . . . . . . . . . .> clustered by (id) sorted by (id asc, name desc) into 4 buckets
. . . . . . . . . . . . . . .> row format delimited fields terminated by ","
. . . . . . . . . . . . . . .> location "/hive/student_bck";
No rows affected (0.216 seconds)
0: jdbc:hive2://hadoop3:10000>
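
As a rough illustration of what clustered by (id) sorted by (id asc, name desc) into 4 buckets means for the data, the Python sketch below assigns rows to buckets and sorts each bucket. It assumes the legacy rule in which an int column hashes to its own value and the bucket is that hash modulo the bucket count; newer Hive releases may use a different hash function, so the bucket numbers are indicative only. The function names are hypothetical.

```python
INT_MAX = 0x7FFFFFFF  # largest signed 32-bit integer


def bucket_for(id_value, num_buckets=4):
    """Return the bucket index for an int clustering column."""
    # Assumption: the hash of an int is the int itself (legacy behaviour).
    return (id_value & INT_MAX) % num_buckets


def bucketize(rows, num_buckets=4):
    """Group (id, name) rows into buckets, each sorted by id asc, name desc."""
    buckets = {b: [] for b in range(num_buckets)}
    for row in rows:
        buckets[bucket_for(row[0], num_buckets)].append(row)
    for rows_in_bucket in buckets.values():
        # Two stable sorts: secondary key (name desc) first, then primary (id asc).
        rows_in_bucket.sort(key=lambda r: r[1], reverse=True)
        rows_in_bucket.sort(key=lambda r: r[0])
    return buckets


sample = [(95001, "李勇"), (95002, "劉晨"), (95005, "劉剛"), (95008, "李娜")]
for bucket, members in bucketize(sample).items():
    print(bucket, members)
```

Each bucket corresponds to one file under the table directory once data has been written through a bucketing-aware insert.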

(5) Create a table with CTAS

Purpose: create a table that stores the result of a SQL query

First load data into the student table

0: jdbc:hive2://hadoop3:10000> load data local inpath "/home/hadoop/student.txt" into table student;
No rows affected (0.715 seconds)
0: jdbc:hive2://hadoop3:10000> select * from student;
+-------------+---------------+--------------+--------------+---------------------+
| student.id  | student.name  | student.sex  | student.age  | student.department  |
+-------------+---------------+--------------+--------------+---------------------+
| 95002       | 劉晨            | 女            | 19           | IS                  |
| 95017       | 王風娟           | 女            | 18           | IS                  |
| 95018       | 王一            | 女            | 19           | IS                  |
| 95013       | 馮偉            | 男            | 21           | CS                  |
| 95014       | 王小麗           | 女            | 19           | CS                  |
| 95019       | 邢小麗           | 女            | 19           | IS                  |
| 95020       | 趙錢            | 男            | 21           | IS                  |
| 95003       | 王敏            | 女            | 22           | MA                  |
| 95004       | 張立            | 男            | 19           | IS                  |
| 95012       | 孫花            | 女            | 20           | CS                  |
| 95010       | 孔小濤           | 男            | 19           | CS                  |
| 95005       | 劉剛            | 男            | 18           | MA                  |
| 95006       | 孫慶            | 男            | 23           | CS                  |
| 95007       | 易思玲           | 女            | 19           | MA                  |
| 95008       | 李娜            | 女            | 18           | CS                  |
| 95021       | 周二            | 男            | 17           | MA                  |
| 95022       | 鄭明            | 男            | 20           | MA                  |
| 95001       | 李勇            | 男            | 20           | CS                  |
| 95011       | 包小柏           | 男            | 18           | MA                  |
| 95009       | 夢圓圓           | 女            | 18           | MA                  |
| 95015       | 王君            | 男            | 18           | MA                  |
+-------------+---------------+--------------+--------------+---------------------+
21 rows selected (0.342 seconds)
0: jdbc:hive2://hadoop3:10000>

Create the table with CTAS

0: jdbc:hive2://hadoop3:10000> create table student_ctas as select * from student where id < 95012;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
No rows affected (34.514 seconds)
0: jdbc:hive2://hadoop3:10000> select * from student_ctas
. . . . . . . . . . . . . . .> ;
+------------------+--------------------+-------------------+-------------------+--------------------------+
| student_ctas.id  | student_ctas.name  | student_ctas.sex  | student_ctas.age  | student_ctas.department  |
+------------------+--------------------+-------------------+-------------------+--------------------------+
| 95002            | 劉晨                 | 女                 | 19                | IS                       |
| 95003            | 王敏                 | 女                 | 22                | MA                       |
| 95004            | 張立                 | 男                 | 19                | IS                       |
| 95010            | 孔小濤                | 男                 | 19                | CS                       |
| 95005            | 劉剛                 | 男                 | 18                | MA                       |
| 95006            | 孫慶                 | 男                 | 23                | CS                       |
| 95007            | 易思玲                | 女                 | 19                | MA                       |
| 95008            | 李娜                 | 女                 | 18                | CS                       |
| 95001            | 李勇                 | 男                 | 20                | CS                       |
| 95011            | 包小柏                | 男                 | 18                | MA                       |
| 95009            | 夢圓圓                | 女                 | 18                | MA                       |
+------------------+--------------------+-------------------+-------------------+--------------------------+
11 rows selected (0.445 seconds)
0: jdbc:hive2://hadoop3:10000>

(6) Copy a table's structure

0: jdbc:hive2://hadoop3:10000> create table student_copy like student;
No rows affected (0.217 seconds)
0: jdbc:hive2://hadoop3:10000>

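Note:

Without the external keyword before table, the copy is always a managed (internal) table.
With the external keyword before table, the copy is always an external table.

2. Viewing Tables

(1) List tables

List the tables in the current database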

0: jdbc:hive2://hadoop3:10000> show tables;
+---------------+
|   tab_name    |
+---------------+
| student       |
| student_bck   |
| student_copy  |
| student_ctas  |
| student_ext   |
| student_ptn   |
+---------------+
6 rows selected (0.163 seconds)
0: jdbc:hive2://hadoop3:10000>

List the tables in a database other than the current one

0: jdbc:hive2://hadoop3:10000> show tables in myhive;
+-----------+
| tab_name  |
+-----------+
| student   |
+-----------+
1 row selected (0.144 seconds)
0: jdbc:hive2://hadoop3:10000>

List the tables whose names start with a given prefix

0: jdbc:hive2://hadoop3:10000> show tables like 'student_c*';
+---------------+
|   tab_name    |
+---------------+
| student_copy  |
| student_ctas  |
+---------------+
2 rows selected (0.13 seconds)
0: jdbc:hive2://hadoop3:10000>
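
The pattern in the session above uses * as a wildcard. The Python sketch below models that matching, assuming the rule documented for this generation of Hive: * matches any run of characters and | separates alternatives. The function show_tables_like is hypothetical and only mimics the behaviour; it is not how Hive implements it.

```python
import re


def show_tables_like(pattern, table_names):
    """Return the names matching a pattern built from '*' and '|'."""
    alternatives = []
    for choice in pattern.split("|"):
        # Escape literal text, then let each '*' match any run of characters.
        alternatives.append(".*".join(re.escape(part) for part in choice.split("*")))
    regex = re.compile("(?:" + "|".join(alternatives) + ")")
    return [name for name in table_names if regex.fullmatch(name)]


tables = ["student", "student_bck", "student_copy", "student_ctas", "student_ext", "student_ptn"]
print(show_tables_like("student_c*", tables))
print(show_tables_like("student_e*|student_p*", tables))
```

With the table list from the earlier show tables output, the first call returns the same two names as the beeline session.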

(2) Show table details

Show basic table information

0: jdbc:hive2://hadoop3:10000> desc student;
+-------------+------------+----------+
|  col_name   | data_type  | comment  |
+-------------+------------+----------+
| id          | int        |          |
| name        | string     |          |
| sex         | string     |          |
| age         | int        |          |
| department  | string     |          |
+-------------+------------+----------+
5 rows selected (0.149 seconds)
0: jdbc:hive2://hadoop3:10000>

Show detailed table information (hard-to-read format)

0: jdbc:hive2://hadoop3:10000> desc extended student;

Show detailed table information (readable format)

0: jdbc:hive2://hadoop3:10000> desc formatted student;

Show partition information

0: jdbc:hive2://hadoop3:10000> show partitions student_ptn;

(3) Show the full CREATE TABLE statement

0: jdbc:hive2://hadoop3:10000> show create table student_ptn;

3. Altering a Table

(1) Rename a table

0: jdbc:hive2://hadoop3:10000> alter table student rename to new_student;

(2) Modify column definitions

A. Add a column

0: jdbc:hive2://hadoop3:10000> alter table new_student add columns (score int);

B. Change a column definition

0: jdbc:hive2://hadoop3:10000> alter table new_student change name new_name string;

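C. Drop a column

Not supported directly; use replace columns (D below) to redefine the column list without the unwanted column.

D. Replace all columns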

0: jdbc:hive2://hadoop3:10000> alter table new_student replace columns (id int, name string, address string);
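
One consequence worth keeping in mind: for a delimited text table, replace columns changes only the table metadata, and the existing files are still read by position. The Python sketch below models that under this assumption; read_row is a hypothetical helper, and the sample line follows the student.txt layout shown earlier.

```python
def read_row(line, columns, delimiter=","):
    """Map the fields of one text line onto column names by position."""
    values = line.split(delimiter)
    # Columns beyond the available fields read as None; extra fields are ignored.
    values += [None] * (len(columns) - len(values))
    return dict(zip(columns, values))


line = "95002,劉晨,女,19,IS"  # a line in the original five-column layout
print(read_row(line, ["id", "name", "sex", "age", "department"]))
print(read_row(line, ["id", "name", "address"]))  # after replace columns
```

In the second call the address column picks up the value that used to be sex, because nothing in the file itself was rewritten.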

(3) Modify partitions

A. Add partitions

Static partitions

    Add one partition

0: jdbc:hive2://hadoop3:10000> alter table student_ptn add partition(city="chongqing");

    Add several partitions

0: jdbc:hive2://hadoop3:10000> alter table student_ptn add partition(city="chongqing2") partition(city="chongqing3") partition(city="chongqing4");

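Dynamic partitions

Start by loading data into the student_ptn table

0: jdbc:hive2://hadoop3:10000> load data local inpath "/home/hadoop/student.txt" into table student_ptn partition(city="beijing");

Next, insert the contents of this table directly into another table, student_ptn_age, with age as the dynamic partition column (no age value is specified; Hive assigns each row to a partition based on its age). If the insert is rejected because dynamic partitioning is in strict mode, set hive.exec.dynamic.partition.mode=nonstrict first.

Create student_ptn_age, partitioned by age

0: jdbc:hive2://hadoop3:10000> create table student_ptn_age(id int,name string,sex string,department string) partitioned by (age int);

Query student_ptn and insert the result into student_ptn_age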

0: jdbc:hive2://hadoop3:10000> insert overwrite table student_ptn_age partition(age)
. . . . . . . . . . . . . . .> select id,name,sex,department,age from student_ptn;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
No rows affected (27.905 seconds)
0: jdbc:hive2://hadoop3:10000>
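
The effect of the dynamic-partition insert above can be sketched in Python: each row is routed to a partition directory named after the value of its partition column, which is taken from the last column of the select list. This is only a model of the outcome; dynamic_partition_insert is a hypothetical function and the three rows are a small sample of the student data.

```python
from collections import defaultdict


def dynamic_partition_insert(rows, partition_column):
    """Group rows by the partition column, keyed by the partition directory name."""
    partitions = defaultdict(list)
    for row in rows:
        row = dict(row)                     # copy so the input is not modified
        value = row.pop(partition_column)   # the partition value is not stored in the data file
        partitions[f"{partition_column}={value}"].append(row)
    return dict(partitions)


rows = [
    {"id": 95001, "name": "李勇", "sex": "男", "department": "CS", "age": 20},
    {"id": 95002, "name": "劉晨", "sex": "女", "department": "IS", "age": 19},
    {"id": 95004, "name": "張立", "sex": "男", "department": "IS", "age": 19},
]
for directory, members in dynamic_partition_insert(rows, "age").items():
    print(directory, [m["id"] for m in members])
```

One partition is created per distinct age value, without any alter table ... add partition statement being issued by hand.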

B. Modify partitions

Modifying a partition usually means changing the directory where its data is stored.

Specify the data directory at the time the partition is added

0: jdbc:hive2://hadoop3:10000> alter table student_ptn add if not exists partition(city='beijing') 
. . . . . . . . . . . . . . .> location '/student_ptn_beijing' partition(city='cc') location '/student_cc';
No rows affected (0.306 seconds)
0: jdbc:hive2://hadoop3:10000>

Change the data directory of an existing partition

0: jdbc:hive2://hadoop3:10000> alter table student_ptn partition (city='beijing') set location '/student_ptn_beijing';

The original partition directory still exists at this point, but data added to the partition from now on goes only to the new directory.

C. Drop a partition

0: jdbc:hive2://hadoop3:10000> alter table student_ptn drop partition (city='beijing');

4. Dropping a Table

0: jdbc:hive2://hadoop3:10000> drop table new_student;

5. Truncating a Table

0: jdbc:hive2://hadoop3:10000> truncate table student_ptn;

Other Helper Commands
