Two Ways to Work with Hive from Python: A Summary

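Method 1: Using the PyHive library

Install the dependencies. Installing sasl may fail with a build error; if it does, download the wheel that matches your Python version from https://www.lfd.uci.edu/~gohlke/pythonlibs/#sasl and install that instead.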

pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive
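
Before moving on, it can help to confirm that these packages are importable from the interpreter you plan to use. The sketch below is one way to check; `missing_modules` is a hypothetical helper name, and the import names `sasl`, `thrift`, `thrift_sasl` and `pyhive` are assumed to correspond to the four pip packages above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the pip packages sasl, thrift, thrift-sasl, PyHive.
    required = ["sasl", "thrift", "thrift_sasl", "pyhive"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All PyHive dependencies are importable.")
```

An empty result only shows that the modules can be located; it does not prove that a connection to the Hive server will succeed.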

Python script:

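from pyhive import hive

# Replace each **** placeholder with your own connection settings.
conn = hive.Connection(host='****', port=****, username='****', database='****')
cursor = conn.cursor()  # a cursor must be created before executing statements
cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
print(cursor.fetchall())

# Insert rows one at a time, using the loop index as the value.
for i in range(****):
    sql = "INSERT INTO **** VALUES ({}, 'username{}')".format(i, i)
    cursor.execute(sql)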
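
Issuing one INSERT per row, as in the loop above, makes Hive process every statement separately, which tends to be slow beyond a handful of rows. Hive 0.14 and later accept several rows in a single `INSERT INTO ... VALUES` statement, so the rows can be batched instead. The following is a minimal sketch, assuming rows that contain only integers and strings; `hive_literal` and `build_insert` are hypothetical helpers, not part of PyHive, and the escaping shown is a simplification. The table name `demo_users` is made up for illustration.

```python
def hive_literal(value):
    """Render an int or str as a HiveQL literal (simplified, illustrative escaping)."""
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Escape backslashes first, then single quotes.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    raise TypeError("unsupported value type: {}".format(type(value).__name__))


def build_insert(table, rows):
    """Build one multi-row INSERT statement for `table` from an iterable of tuples."""
    rendered = [
        "(" + ", ".join(hive_literal(v) for v in row) + ")"
        for row in rows
    ]
    if not rendered:
        raise ValueError("rows must not be empty")
    return "INSERT INTO {} VALUES {}".format(table, ", ".join(rendered))


# Hypothetical table name and data, for illustration only.
rows = [(i, "username{}".format(i)) for i in range(3)]
sql = build_insert("demo_users", rows)
print(sql)
# INSERT INTO demo_users VALUES (0, 'username0'), (1, 'username1'), (2, 'username2')
```

The resulting string can then be passed to `cursor.execute(sql)` once, in place of one call per row.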


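# The example below is from the official PyHive documentation (it connects to Presto; import hive instead to target Hive):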
from pyhive import presto  # or import hive
cursor = presto.connect('localhost').cursor()
cursor.execute('SELECT * FROM my_awesome_data LIMIT 10')
print(cursor.fetchone())
print(cursor.fetchall())
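
`fetchall()` loads the whole result set into memory, which may be a problem for large tables. The DB-API also defines `fetchmany()`, which PyHive cursors provide, so rows can be consumed in batches. The sketch below works under that assumption; `iter_rows` is a hypothetical helper and the batch size is arbitrary. sqlite3 from the standard library stands in for the Hive connection so that the snippet runs without a cluster.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows from an executed DB-API cursor, fetching `batch_size` rows at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


# sqlite3 is used here only as a stand-in for a PyHive cursor.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE my_awesome_data (id INTEGER)")
cursor.executemany("INSERT INTO my_awesome_data VALUES (?)", [(i,) for i in range(5)])
cursor.execute("SELECT id FROM my_awesome_data ORDER BY id")
ids = [row[0] for row in iter_rows(cursor, batch_size=2)]
print(ids)
```

With a real PyHive cursor, only the connection lines would change; the generator itself relies on nothing beyond `fetchmany()`.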

Method 2: Using the impyla library

impyla dependencies:

pip install six
pip install bitarray
pip install thriftpy

To support Hive, the following two packages are also required:

pip install sasl
pip install thrift_sasl

impyla and its dependencies are published on PyPI (pip install impyla), where their source packages can also be downloaded.

Python script:

from impala.dbapi import connect

# Replace each **** placeholder with your own connection settings.
conn = connect(host='****', port=****)
cursor = conn.cursor()
cursor.execute('SELECT * FROM mytable LIMIT 100')
print(cursor.description)  # print the schema of the result set
results = cursor.fetchall()
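
`cursor.description` follows the DB-API convention: one tuple per column, with the column name as its first element. That makes it straightforward to turn the rows from `fetchall()` into dictionaries keyed by column name. A minimal sketch follows; `rows_as_dicts` is a hypothetical helper, and sqlite3 stands in for the impyla connection so that the example runs without a server.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Return all remaining rows of an executed DB-API cursor as dicts keyed by column name."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# sqlite3 is only a stand-in for an impyla cursor here.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE mytable (id INTEGER, name TEXT)")
cursor.execute("INSERT INTO mytable VALUES (1, 'alice')")
cursor.execute("SELECT id, name FROM mytable")
records = rows_as_dicts(cursor)
print(records)
```

Depending on server settings, Hive may return column names prefixed with the table name (for example `mytable.id`), so the dictionary keys may differ from the bare column names.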