
Sqoop: migrating data from DB2 to HDFS


A Sqoop import job failed to read data from a DB2 database that uses UTF-8 encoding. Essentially, the affected rows cannot be read even inside DB2 with SELECT queries, because they contain byte sequences that are not valid UTF-8.

The Sqoop job throws an error similar to the following:

Error: java.io.IOException: SQLException in nextKeyValue
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
..
..
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.19.26] Caught java.io.CharConversionException.  See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
        at com.ibm.db2.jcc.am.kd.a(Unknown Source)
        at com.ibm.db2.jcc.am.kd.a(Unknown Source)
..
..
Caused by: java.nio.charset.MalformedInputException: Input length = 527
        at com.ibm.db2.jcc.am.s.a(Unknown Source)
        ... 22 more
Caused by: sun.io.MalformedInputException
        at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:105)
        ... 23 more
2018-09-10 06:01:34,879 INFO mapreduce.Job: map 0% reduce 0%
2018-09-10 06:01:45,942 INFO mapreduce.Job: map 100% reduce 0%
2018-09-10 06:02:02,039 INFO mapreduce.Job: Task Id : attempt_1535965915754_0038_m_000000_2, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1988)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: com.ibm.db2.jcc.am.SqlException: [jcc][t4][1065][12306][4.16.53] Caught java.io.CharConversionException. See attached Throwable for details. ERRORCODE=-4220, SQLSTATE=null
        at com.ibm.db2.jcc.am.fd.a(fd.java:723)
        at com.ibm.db2.jcc.am.fd.a(fd.java:60)
        at com.ibm.db2.jcc.am.fd.a(fd.java:112)
        at com.ibm.db2.jcc.am.jc.a(jc.java:2870)
        at com.ibm.db2.jcc.am.jc.p(jc.java:527)
        at com.ibm.db2.jcc.am.jc.N(jc.java:1563)
        at com.ibm.db2.jcc.am.ResultSet.getStringX(ResultSet.java:1153)
        at com.ibm.db2.jcc.am.ResultSet.getString(ResultSet.java:1128)
        at org.apache.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:71)
        at com.cloudera.sqoop.lib.JdbcWritableBridge.readString(JdbcWritableBridge.java:61)
        at PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.readFields(PC_KPI_PC_INCIDENT_CFIUS_CONSTRAINED.java:197)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)
        ... 12 more
Caused by: java.nio.charset.MalformedInputException: Input length = 574820
        at com.ibm.db2.jcc.am.r.a(r.java:19)
        at com.ibm.db2.jcc.am.jc.a(jc.java:2862)
        ... 20 more
Caused by: sun.io.MalformedInputException
        at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:167)
        at com.ibm.db2.jcc.am.r.a(r.java:16)
        ... 21 more

Solution:

Add the following property to mapred-site.xml on the YARN cluster, so that the map tasks start with the DB2 JCC driver's lenient charset handling enabled:

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3</value>
</property>
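If changing mapred-site.xml cluster-wide is not an option, the same JVM option can typically be passed per job through Hadoop's generic -D argument, which must come right after the tool name. A sketch; the connection URL, credentials, table, and target directory below are placeholders, not values from this incident:

```shell
sqoop import \
  -Dmapreduce.map.java.opts="-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3" \
  --connect jdbc:db2://db2host:50000/MYDB \
  --username myuser \
  --password-file /user/me/db2.pwd \
  --table MYSCHEMA.MYTABLE \
  --target-dir /data/mytable
```

This only affects the one job, which is convenient for verifying the fix before touching the cluster configuration.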

http://www-01.ibm.com/support/docview.wss?uid=swg21684365
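The mechanics behind the -4220 error can be reproduced with the plain JDK: by default the driver's UTF-8 decoder reports malformed byte sequences as exceptions, while db2.jcc.charsetDecoderEncoder=3 tells it to substitute a replacement character instead. A minimal illustration of that difference, using java.nio directly rather than the DB2 driver:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

public class Utf8DecodeDemo {

    // Strict decoding: throws MalformedInputException, analogous to the
    // driver's default behavior that makes the Sqoop map task fail.
    static String decodeStrict(byte[] bytes) throws Exception {
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return strict.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Lenient decoding: malformed sequences become U+FFFD, analogous to
    // what db2.jcc.charsetDecoderEncoder=3 enables in the JCC driver.
    static String decodeLenient(byte[] bytes) throws Exception {
        CharsetDecoder lenient = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        return lenient.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 0xC3 starts a two-byte UTF-8 sequence, but 0x28 ('(') is not a
        // valid continuation byte, so this input is malformed.
        byte[] bad = {'a', (byte) 0xC3, 0x28, 'b'};
        try {
            decodeStrict(bad);
        } catch (MalformedInputException e) {
            System.out.println("strict decode failed: " + e);
        }
        System.out.println("lenient decode: " + decodeLenient(bad));
    }
}
```

With the lenient decoder the bad row is imported with replacement characters in place of the unreadable bytes, instead of killing the mapper; whether that data loss is acceptable depends on the use case.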
