Kylin Errors and Solutions: A Summary

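I. Error in the build cube step: Value not exists!

The MR log for this step reports "Not a valid value:2017-05-31". There are two possible causes:

1. The dimension table data referenced by the build changed while the build was running, so looking up this value in the dimension table finds no matching row.

2. The OLAP table is joined to a dimension table but only the join column is used. If a code in the OLAP table does not exist in the dimension table, the build fails.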

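    Solution:

         1. Check whether the value exists in the dimension table.

         2. If it does not, find out why it is missing from the dimension table.

         3. Check whether the value is reasonable in the OLAP table.

         4. If the OLAP table is joined to a dimension table but only the join column is used, a code that exists in the OLAP table but not in the dimension table causes the Value not exists! error.

              In that case the corresponding setting can be set to true to force Kylin to join the dim table, which filters out OLAP values that do not exist in the dimension table.

              If the dimension table itself is faulty (incomplete or empty), OLAP data will be filtered out as well, so choose the setting according to your scenario.

              After changing the setting, check that the generated SQL is what you expect (see figure).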
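Checks 1 to 3 above amount to an anti-join between the fact table and the dimension table. The following is a minimal Python sketch of that check, assuming the join column values have already been exported from both tables into plain lists; the function name and the sample values are hypothetical and not part of Kylin.

```python
def find_missing_codes(fact_codes, dim_keys):
    """Return fact-table codes that have no matching key in the dimension table."""
    known = set(dim_keys)
    missing = []
    seen = set()
    for code in fact_codes:
        # Report each missing code once, in first-seen order.
        if code not in known and code not in seen:
            seen.add(code)
            missing.append(code)
    return missing


if __name__ == "__main__":
    # Hypothetical sample data: the fact table holds a date the dimension table lacks.
    fact = ["2017-05-30", "2017-05-31", "2017-05-31"]
    dim = ["2017-05-29", "2017-05-30"]
    print(find_missing_codes(fact, dim))
```

Any value returned here is a candidate for the "Not a valid value" message, and is the value to investigate in the dimension table.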

              

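II. Build step 3, Extract Fact Table Distinct Columns, fails with: ArrayIndexOutOfBoundsException: -1 (array index out of bounds)

Cause: the OLAP table has a column with the same name as a dimension table column, or two dimension tables share a column name. (A faulty model can also cause this: join columns and dimensions should be chosen from the fact table columns. Resolved.)

Fix: remove the duplicate column names.

III. Build step 4, Build Dimension Dictionary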

Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_CUSTOMER_STOCK_2_DA.CUST_ID_STOCK

	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File does not exist: /kylin/kylin-kylin_metadata/resources/GlobalDict/dict/OLAP.OLAP_CUSTOMER_STOCK_2_DA/CUST_ID_STOCK/.index
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127)
	at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more
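Or a different log may appear.

Cause: this column is configured with a global dictionary. If several segment build jobs are submitted at the same time and they reach the Build Dimension Dictionary step simultaneously, multiple jobs operate on the same dictionary file, which causes the exception.

Avoid building global dictionary jobs in parallel where possible. If the problem occurs, resume the job or resubmit the build. (A distributed lock has since been added, so parallel builds are possible and this error should no longer appear.)

IV.

When a global dictionary is used, build step 4, Build Dimension Dictionary, fails.

The error is shown below; OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID is the column configured with a global dictionary.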

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_LOG_WEB_TS_DI.ORIGINAL_SESSION_ID
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException
	at org.apache.kylin.dict.CachedTreeMap.writeValue(CachedTreeMap.java:240)
	at org.apache.kylin.dict.CachedTreeMap.write(CachedTreeMap.java:374)
	at org.apache.kylin.dict.AppendTrieDictionary.flushIndex(AppendTrieDictionary.java:1043)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.build(AppendTrieDictionary.java:954)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:82)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more

result code:2

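Cause: the global dictionary has a capacity limit. The string length of a count distinct measure column cannot exceed 255. Check the length of the raw data in the failing column.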
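To check the raw data length, a small script over an exported sample of the column can help. This is a minimal sketch, assuming the values are available as Python strings; the function name is hypothetical. The text above does not say whether the 255 limit counts characters or bytes, so the sketch flags on UTF-8 byte length, which is never smaller than the character count and is therefore the conservative choice.

```python
MAX_LEN = 255  # limit quoted in the text above


def find_overlong_values(values, max_len=MAX_LEN):
    """Return (value, char_len, byte_len) for values longer than max_len."""
    overlong = []
    for value in values:
        if value is None:
            continue  # NULLs carry no string to encode
        char_len = len(value)
        byte_len = len(value.encode("utf-8"))
        # byte_len >= char_len always holds, so testing bytes covers both readings.
        if byte_len > max_len:
            overlong.append((value, char_len, byte_len))
    return overlong
```

Values reported with a character length under 255 but a byte length over it are multi-byte strings, which are easy to miss when checking length in SQL.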

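IV-2:

When a global dictionary is used, build step 4, Build Dimension Dictionary, fails.

This is currently an intermittent exception. Discard the job and resubmit it.

IV-3:

Build step 4, Build Dimension Dictionary, fails.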

java.lang.RuntimeException: Failed to create dictionary on OLAP.OLAP_MKT_LOG_ACCESS_PAGE_INDICATOR_DI.USER_ID
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
	at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:185)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.EOFException
	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:127)
	at org.apache.kylin.dict.DictionaryManager.getDictionary(DictionaryManager.java:114)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.createNewBuilder(AppendTrieDictionary.java:884)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:844)
	at org.apache.kylin.dict.AppendTrieDictionary$Builder.getInstance(AppendTrieDictionary.java:838)
	at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:65)
	at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
	at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
	... 14 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:392)
	at org.apache.kylin.dict.AppendTrieDictionary.readFields(AppendTrieDictionary.java:1238)
	at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:74)
	at org.apache.kylin.dict.DictionaryInfoSerializer.deserialize(DictionaryInfoSerializer.java:34)
	at org.apache.kylin.common.persistence.ResourceStore.getResource(ResourceStore.java:146)
	at org.apache.kylin.dict.DictionaryManager.load(DictionaryManager.java:421)
	at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:103)
	at org.apache.kylin.dict.DictionaryManager$1.load(DictionaryManager.java:100)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
	at org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:120)
	... 21 more

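(Contact the administrator to handle this.)

1. Under the global dictionary path, find the directories whose .index file has size 0, then delete them or mv them elsewhere.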
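The search for zero-size .index files can be scripted. The sketch below assumes the recursive listing of the global dictionary path has been captured as text, for example from hdfs dfs -ls -R, whose columns are permissions, replication, owner, group, size, date, time and path. The function name and the sample paths in the test are hypothetical. It only reports candidate directories; deleting or moving them stays a manual decision.

```python
def find_empty_index_dirs(ls_lines):
    """Return directories whose .index file is 0 bytes, given recursive ls output lines."""
    empty_dirs = []
    for line in ls_lines:
        # Columns: permissions, replication, owner, group, size, date, time, path
        parts = line.split(None, 7)
        if len(parts) < 8:
            continue  # skip headers and blank lines
        perms, size, path = parts[0], parts[4], parts[7].strip()
        if perms.startswith("d"):
            continue  # directories also list size 0; only files matter here
        if path.endswith("/.index") and size == "0":
            empty_dirs.append(path[: -len("/.index")])
    return empty_dirs
```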

V.

GlobalDict /dict/OLAP.OLAP_CUSTOMER_NEW_3_DI/CUST_PHONE1_ENCRYPTED should have 0 or 1 append dict but 2

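(Contact the administrator to handle this.)

1. Find the mismatched global dictionaries in the cube desc.

2. Delete those dictionaries and the corresponding segments.

Additional note:

1. The steps above may leave the dictionary in a bad state and make count distinct inaccurate. It is best to purge the whole cube, run metastore.sh clean, and then rebuild the data.

VI.

Caused by an HBase problem. Cancel the job and retry.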

VII: Refreshing the job list fails

The exception is:

java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
at org.apache.kylin.rest.service.JobService.parseToJobStep(JobService.java:309)
at org.apache.kylin.rest.service.JobService.parseToJobInstance(JobService.java:303)
at org.apache.kylin.rest.service.JobService.access$000(JobService.java:73)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:134)
at org.apache.kylin.rest.service.JobService$1.apply(JobService.java:131)
at com.google.common.collect.Iterators$8.transform(Iterators.java:860)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at com.google.common.collect.Lists.newArrayList(Lists.java:145)
at com.google.common.collect.Lists.newArrayList(Lists.java:125)
at org.apache.kylin.rest.service.JobService.listCubeJobInstance(JobService.java:131)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:103)
at org.apache.kylin.rest.service.JobService.listAllJobs(JobService.java:84)
at org.apache.kylin.rest.service.JobService$$FastClassBySpringCGLIB$$83a44b2a.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:629)
at org.apache.kylin.rest.service.JobService$$EnhancerBySpringCGLIB$$29ce7197.listAllJobs(<generated>)
at org.apache.kylin.rest.controller.JobController.list(JobController.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)

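Fix:

On server05, refresh the job list and check the last record where kylin.out stopped.

Use ./metastore.sh remove "/execute/${id}" to delete that record.

VIII: HBase table creation conflict

1) Discard the job.

2) Call the delete segment API.

3) Delete the table in HBase.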

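IX:

A global dictionary is used and the Build Base Cuboid Data step takes too long.

The MR counters show a long GC time.

1) Use an ad hoc query to check the distinct count of the column configured with a global dictionary.

2) Add the following cube settings:

kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g

kylin.job.mr.config.override.mapreduce.map.memory.mb=8500

If memory is still insufficient, change them to the following (adjust the map size first, i.e. the first two settings; if that is still not enough, set all four):

kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g

kylin.job.mr.config.override.mapreduce.map.memory.mb=16000

kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g

kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000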
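In each pair above the JVM heap (-Xmx) is kept below the container size (memory.mb): 8g is 8192 MB against 8500, and 15g is 15360 MB against 16000. When choosing other values, the same relationship can be checked with a small helper. This is a sketch with hypothetical function names; it only understands -Xmx values given in g or m.

```python
import re


def heap_mb(java_opts):
    """Extract the -Xmx value from a java.opts string and convert it to MB."""
    match = re.search(r"-Xmx(\d+)([gGmM])", java_opts)
    if match is None:
        raise ValueError("no -Xmx setting found in %r" % java_opts)
    amount, unit = int(match.group(1)), match.group(2).lower()
    return amount * 1024 if unit == "g" else amount


def heap_fits_container(java_opts, memory_mb):
    """True when the JVM heap is smaller than the container memory in MB."""
    return heap_mb(java_opts) < memory_mb
```

A heap that is equal to or larger than the container size leaves no room for non-heap memory, so the container is likely to be killed before the larger heap helps.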

X. Step 3

Problem:

java.lang.IllegalStateException
	at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:98)
	at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
	at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

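Fix: reload metadata

XI: Query fails with Not a valid ID

Possible cause 1: modifying a cube to add a dimension leaves the cube metadata out of sync

When a new dimension column is added to an existing cube, the cube build can still succeed. But a query that includes the new dimension fails with: Not a valid ID, because the dimension is not included in the cube's metadata. When using Kylin, avoid modifying an existing cube; create a new cube, or clone the cube and modify the clone.

Possible cause 2: the cube has auto merge enabled

Kylin's auto merge has a bug, so disabling it is recommended. Time ranges whose queries fail can be repaired by rebuilding them.

XII: Build Dimension Dictionary fails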

java.lang.IllegalStateException: Dup key found, key=[0], value1=[0,未知], value2=[0,null]
	at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:85)
	at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:68)
	at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
	at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:56)
	at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
	at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:674)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:60)
	at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
	at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Cause: the dimension table primary key has duplicates. The key 0 appears twice in the dimension table, as [0,未知] and [0,null].
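Duplicate keys like this can be found before the build by grouping the dimension table rows by key. A minimal sketch, assuming the rows have been exported as tuples with the key in the first position; the function name is hypothetical.

```python
from collections import defaultdict


def find_duplicate_keys(rows, key_index=0):
    """Return {key: [rows...]} for every key that appears in more than one row."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[row[key_index]].append(row)
    return {key: dup for key, dup in by_key.items() if len(dup) > 1}
```

For the rows in the error above, the result would contain key 0 with both rows, which shows which of the two should be removed from the dimension table.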

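XIII:

Loading an OLAP table that already exists in Kylin fails.

Cause: a cube in DESCBROKEN state exists. Drop it to fix the problem.

XIV:

Build step 3 fails.

Cause: the configuration is out of sync across the build machines, for example

kylin.cube.aggrgroup.max.combination is set to different values.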
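One way to spot this kind of drift is to compare the properties files from each build machine. The sketch below assumes the file contents have already been collected as text, keyed by host name; the host names and function names are hypothetical, and the parser only handles simple key=value lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def diff_configs(configs):
    """configs maps host -> properties text. Return {key: {host: value}} for keys that differ."""
    parsed = {host: parse_properties(text) for host, text in configs.items()}
    all_keys = set()
    for props in parsed.values():
        all_keys.update(props)
    diffs = {}
    for key in sorted(all_keys):
        # A key missing on a host shows up as None for that host.
        values = {host: props.get(key) for host, props in parsed.items()}
        if len(set(values.values())) > 1:
            diffs[key] = values
    return diffs
```

An empty result means the compared files agree on every key; any entry names the setting to align before rebuilding.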

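XV:

Symptom: after a measure configured with a global dictionary is built, the data does not match the results queried from Hive.

Fix: this is a global dictionary problem. Purge the data, delete the dictionary, and rebuild.

XVI:

The MR job fails with Error: GC overhead limit exceeded. The cause is insufficient MR memory.

Add the following cube settings:

kylin.job.mr.config.override.mapred.map.child.java.opts=-Xmx8g

kylin.job.mr.config.override.mapreduce.map.memory.mb=8500

If memory is still insufficient, change them to the following (adjust the map size first, i.e. the first two settings; if that is still not enough, set all four):

kylin.job.mr.config.override.mapreduce.map.java.opts=-Xmx15g

kylin.job.mr.config.override.mapreduce.map.memory.mb=16000

kylin.job.mr.config.override.mapreduce.reduce.java.opts=-Xmx15g

kylin.job.mr.config.override.mapreduce.reduce.memory.mb=16000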