Using array, map, and struct in Hive

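Hive provides three complex data types:
Structs: the fields of a struct are accessed with dot (.) notation. For example, if column c has type STRUCT<a:INT, b:INT>, field a is accessed as c.a.
Maps (key-value pairs): a value is accessed with ['key name']. For example, if a map M contains the key-value pair 'group' -> gid, the value gid is retrieved with M['group'].
Arrays: all elements of an array have the same type, and indexes start at 0. For example, if array A holds the elements ['a', 'b', 'c'], then A[1] is 'b'.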
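
The three access forms described above can be tried without creating any table. The following is a minimal sketch, assuming Hive 0.13 or later (which accepts a SELECT without a FROM clause); the aliases t, c, m, and arr are made up for illustration and are not part of the original examples.

```sql
-- Build one struct, one map, and one array from literals, then read a value from each.
SELECT
  t.c.a        AS struct_field,   -- dot notation on a struct          -> 1
  t.m['group'] AS map_value,      -- lookup by key on a map            -> 100
  t.arr[1]     AS array_element   -- zero-based index, second element  -> b
FROM (
  SELECT
    named_struct('a', 1, 'b', 2) AS c,    -- STRUCT<a:INT, b:INT>
    map('group', 100)            AS m,    -- MAP<STRING, INT>
    array('a', 'b', 'c')         AS arr   -- ARRAY<STRING>
) t;
```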

Using struct

Create the table:

  hive> create table student_test(id INT, info struct<name:STRING, age:INT>)
      > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      > COLLECTION ITEMS TERMINATED BY ':';
  OK
  Time taken: 0.446 seconds

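'FIELDS TERMINATED BY': the separator between fields (columns) in a row.
'COLLECTION ITEMS TERMINATED BY': the separator between the items inside a single field; for a struct, it separates the member values.

Load the data: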

  $ cat test5.txt
  1,zhou:30
  2,yan:30
  3,chen:20
  4,li:80
  hive> LOAD DATA LOCAL INPATH '/home/work/data/test5.txt' INTO TABLE student_test;
  Copying data from file:/home/work/data/test5.txt
  Copying file: file:/home/work/data/test5.txt
  Loading data to table default.student_test
  OK
  Time taken: 0.35 seconds

Query:

  hive> select info.age from student_test;
  Total MapReduce jobs = 1
  ......
  Total MapReduce CPU Time Spent: 490 msec
  OK
  30
  30
  20
  80
  Time taken: 21.677 seconds
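
Struct fields can also appear in a WHERE clause or next to ordinary columns. The query below is a small sketch against the student_test table defined above; the threshold of 30 is an arbitrary value chosen for illustration.

```sql
-- Select both struct members alongside the plain id column,
-- keeping only students whose age is at least 30.
SELECT id,
       info.name AS name,
       info.age  AS age
FROM student_test
WHERE info.age >= 30;
```

With the four rows loaded from test5.txt, this should keep zhou, yan, and li and drop chen, whose age is 20.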


Using array

Create the table:

  hive> create table class_test(name string, student_id_list array<INT>)
      > ROW FORMAT DELIMITED
      > FIELDS TERMINATED BY ','
      > COLLECTION ITEMS TERMINATED BY ':';
  OK
  Time taken: 0.099 seconds

Load the data:

  $ cat test6.txt
  034,1:2:3:4
  035,5:6
  036,7:8:9:10
  hive> LOAD DATA LOCAL INPATH '/home/work/data/test6.txt' INTO TABLE class_test;
  Copying data from file:/home/work/data/test6.txt
  Copying file: file:/home/work/data/test6.txt
  Loading data to table default.class_test
  OK
  Time taken: 0.198 seconds

Query:

  hive> select student_id_list[3] from class_test;
  Total MapReduce jobs = 1
  ......
  Total MapReduce CPU Time Spent: 480 msec
  OK
  4
  NULL
  10
  Time taken: 21.574 seconds

Array indexes are zero-based, so student_id_list[3] is the fourth element. Row 035 has only two elements, so the query returns NULL for it.
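
Besides indexing, Hive ships built-in functions for working with whole arrays. The sketch below uses size(), array_contains(), and LATERAL VIEW explode() on the class_test table from above; the aliases n_students, ids, and student_id are made up for illustration.

```sql
-- Number of elements in each array.
SELECT name, size(student_id_list) AS n_students
FROM class_test;

-- Rows whose array contains a given value (6 is an arbitrary example).
SELECT name
FROM class_test
WHERE array_contains(student_id_list, 6);

-- Flatten the array: one output row per (name, element) pair.
SELECT name, student_id
FROM class_test
LATERAL VIEW explode(student_id_list) ids AS student_id;
```

The explode form is usually the most convenient starting point when the elements need to be joined against or aggregated with another table.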


Using map

Create the table:

  hive> create table employee(id string, perf map<string, int>)
      > ROW FORMAT DELIMITED
      > FIELDS TERMINATED BY '\t'
      > COLLECTION ITEMS TERMINATED BY ','
      > MAP KEYS TERMINATED BY ':';
  OK
  Time taken: 0.144 seconds

'MAP KEYS TERMINATED BY': the separator between a key and its value inside a map.

Load the data (the id and the map are separated by a tab character, matching FIELDS TERMINATED BY '\t'):

  $ cat test7.txt
  1 job:80,team:60,person:70
  2 job:60,team:80
  3 job:90,team:70,person:100
  hive> LOAD DATA LOCAL INPATH '/home/work/data/test7.txt' INTO TABLE employee;

Query:

  hive> select perf['person'] from employee;
  Total MapReduce jobs = 1
  ......
  Total MapReduce CPU Time Spent: 460 msec
  OK
  70
  NULL
  100
  Time taken: 20.902 seconds
  hive> select perf['person'] from employee where perf['person'] is not null;
  Total MapReduce jobs = 1
  .......
  Total MapReduce CPU Time Spent: 610 msec
  OK
  70
  100
  Time taken: 21.989 seconds
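
Looking up a key that is missing from a map returns NULL, as row 2 shows above. When the keys are not known in advance, the built-in functions map_keys(), map_values(), size(), and explode() can be used to inspect the map itself. The following is a sketch against the employee table; the aliases metrics, scores, n_entries, p, metric, and score are made up for illustration.

```sql
-- List the keys, the values, and the number of entries in each map.
SELECT id,
       map_keys(perf)   AS metrics,
       map_values(perf) AS scores,
       size(perf)       AS n_entries
FROM employee;

-- Flatten the map: one output row per (id, key, value) triple.
SELECT id, metric, score
FROM employee
LATERAL VIEW explode(perf) p AS metric, score;
```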
