Hive streaming: an error occurred when trying to close the Operator running your custom script
阿新 • Published: 2018-12-15
When running a SELECT query in Hive, we can write a Python, PHP, or Perl script to do custom row processing, using Hive's TRANSFORM and USING clauses.
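Under the hood, TRANSFORM ... USING is plain process streaming: Hive writes each input row to the script's standard input as one tab-separated line, and each tab-separated line the script prints to standard output becomes an output row. A minimal sketch of such a script in Python (the name upper.py and the upper-casing logic are illustrative assumptions, not from the original post):

```python
import sys

def transform_row(line):
    """Upper-case every tab-separated column of one input row."""
    columns = line.rstrip('\n').split('\t')
    return '\t'.join(col.upper() for col in columns)

if __name__ == "__main__":
    # Hive streams each input row to stdin; every line printed
    # to stdout becomes one output row.
    for line in sys.stdin:
        print(transform_row(line))
```

Such a script would be referenced the same way as the Perl one below, e.g. using 'python upper.py' after an add file.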
When doing so, it is easy to hit the following error: an error occurred when trying to close the Operator running your custom script.
hive> create table kv_data(line string);
OK
Time taken: 0.935 seconds
hive> load data local inpath '${env:HOME}/qhy/kv_data.txt' into table kv_data;
Loading data to table default.kv_data
Table default.kv_data stats: [numFiles=1, totalSize=49]
OK
Time taken: 1.174 seconds
hive> select * from kv_data;
OK
k1=v1,k2=v2
k4=v4,k5=v5,k6=k6
k7=v7,k7=v7,k3=v7
Time taken: 0.155 seconds, Fetched: 4 row(s)
hive> select transform(line)
    > using 'perl split_kv.pl'
    > as (key,value)
    > from kv_data;
Query ID = hadoop_20181214115722_6bd27f59-f419-49f1-a1b0-1c33aefaf4f1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1544754045260_0009, Tracking URL = http://master:8088/proxy/application_1544754045260_0009/
Kill Command = /home/hadoop/opt/software/hadoop-2.8.3/bin/hadoop job -kill job_1544754045260_0009
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-12-14 11:57:46,033 Stage-1 map = 0%, reduce = 0%
2018-12-14 11:58:16,628 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1544754045260_0009 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1544754045260_0009_m_000000 (and more) from job job_1544754045260_0009

Task with the most failures(4):
-----
Task ID:
  task_1544754045260_0009_m_000000

URL:
  http://master:8088/taskdetails.jsp?jobid=job_1544754045260_0009&tipid=task_1544754045260_0009_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:210)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
	at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:560)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:631)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:192)
	... 8 more

FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
The cause is that the map script was never added to the distributed cache, which is why the job fails with metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script. Note: the script path used here is a local filesystem path, not an HDFS path.
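The source of split_kv.pl is not shown in the post, but judging from the input rows (k1=v1,k2=v2) and the output below (one key/value pair per row), it presumably splits each line on commas and then on '='. A Python equivalent of that presumed per-row logic, as a sketch only:

```python
import sys

def split_kv(line):
    """Split a row like 'k1=v1,k2=v2' into (key, value) pairs."""
    pairs = []
    for item in line.strip().split(','):
        if '=' in item:
            key, value = item.split('=', 1)
            pairs.append((key, value))
    return pairs

if __name__ == "__main__":
    # Hive streams rows in on stdin; each tab-separated line printed
    # to stdout becomes one (key, value) output row.
    for row in sys.stdin:
        for key, value in split_kv(row):
            print(f"{key}\t{value}")
```

Applied to the three rows of kv_data, this yields the same eight key/value rows that the successful run below produces.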
The fix is simply to run add file in Hive first, as shown below:
hive> add file ${env:HOME}/qhy/split_kv.pl;
Added resources: [/home/hadoop/qhy/split_kv.pl]
hive> select transform(line)
    > using 'perl split_kv.pl'
    > as (key,value)
    > from kv_data;
Query ID = hadoop_20181214142017_79c197f6-3b07-4b0b-9cad-f9f8be078cb9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1544754045260_0012, Tracking URL = http://master:8088/proxy/application_1544754045260_0012/
Kill Command = /home/hadoop/opt/software/hadoop-2.8.3/bin/hadoop job -kill job_1544754045260_0012
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2018-12-14 14:20:36,236 Stage-1 map = 0%, reduce = 0%
2018-12-14 14:20:47,251 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.22 sec
MapReduce Total cumulative CPU time: 1 seconds 220 msec
Ended Job = job_1544754045260_0012
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 1.22 sec   HDFS Read: 3527 HDFS Write: 48 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 220 msec
OK
k1	v1
k2	v2
k4	v4
k5	v5
k6	k6
k7	v7
k7	v7
k3	v7
Time taken: 30.884 seconds, Fetched: 8 row(s)
hive>