Reading HDFS file contents into MySQL (continued)
By 阿新 • Published: 2019-01-24
I now want to write a standalone class that reads HDFS file contents and imports them into MySQL, i.e. a plain main-method program using the Java API.
Configuration conf = new Configuration(true);
conf.set("fs.default.name", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = null;
try {
    fs = FileSystem.get(conf);
} catch (Exception e) {
    LOG.error("getFileSystem failed :" + e.getMessage());
}
But the code above fails with: java.net.UnknownHostException: hdfs://cluster2 — the client treats "cluster2" as a hostname because it has no HA nameservice configuration to resolve it against.
Since this cluster runs Hadoop YARN 2.2, I followed the configuration in the post at http://www.oschina.net/code/snippet_121248_34430 and added the missing properties to the conf.
The corrected code:
conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://cluster2");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
conf.set("ha.zookeeper.quorum", "xx:2181,xx:2181,xx:2181");
conf.set("dfs.nameservices", "cluster2");
conf.set("dfs.ha.namenodes.cluster2", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.cluster2.nn1", "xx:8020");
conf.set("dfs.namenode.rpc-address.cluster2.nn2", "xx:8020");
conf.set("dfs.client.failover.proxy.provider.cluster2", "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
conf.set("hadoop.security.authentication", "kerberos");
conf.set("yarn.resourcemanager.scheduler.address", "xx:8030");
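For context, the goal of the program is to read the HDFS file line by line once `FileSystem.get(conf)` succeeds. A minimal sketch of that read loop, assuming the HA configuration above and a placeholder file path:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // ... the same HA properties shown above would be set here ...
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/user/example/input.txt"); // placeholder path
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // each line would be parsed and inserted into MySQL here
                System.out.println(line);
            }
        }
    }
}
```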
The error message finally changed, but I could not resolve this new one either:
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1681)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
	at com.netease.weblogOffline.exp.mysql.OrgMediaSQL.main(OrgMediaSQL.java:126)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client.call(Client.java:1347)
	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy7.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:622)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy8.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
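This exception usually means the client never obtained Kerberos credentials, so it fell back to SIMPLE auth, which the secured NameNode rejects. One common fix (not attempted in this post) is to log in from a keytab via `UserGroupInformation` before calling `FileSystem.get()`. A hedged sketch, where the principal and keytab path are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLoginSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        // make the security layer use Kerberos instead of the SIMPLE default
        UserGroupInformation.setConfiguration(conf);
        // authenticate with a keytab (placeholder principal and keytab path)
        UserGroupInformation.loginUserFromKeytab(
                "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Authenticated as: "
                + UserGroupInformation.getLoginUser());
    }
}
```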
My ability being limited, after failing to find a working solution I fell back to the original approach.
Use the hadoop command to run an empty job whose only real work is reading the HDFS file contents.
Thinking it over afterwards: a Configuration assembled in a bare `java -classpath` program inevitably carries fewer properties than the configuration files shipped under the Hadoop installation.
So I went back to running it honestly through hadoop, and made my peace with that.
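The fallback described above can be sketched as a minimal `Tool` driver launched with `hadoop jar`, so that every property from the cluster's own config files (core-site.xml, hdfs-site.xml, and the Kerberos ticket environment) is picked up automatically instead of being re-declared by hand. The class and jar names here are illustrative:

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ReadHdfsDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() carries everything loaded from the cluster's config files
        FileSystem fs = FileSystem.get(getConf());
        Path file = new Path(args[0]);
        System.out.println("exists: " + fs.exists(file));
        // ... read the file and write rows into MySQL here ...
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new ReadHdfsDriver(), args));
    }
}
```

Launched on a cluster node as, for example, `hadoop jar myjob.jar ReadHdfsDriver /path/to/input`, so no empty MapReduce job is actually needed, only the client-side configuration that `hadoop jar` provides.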