
Netty in Depth: NioEventLoop


The previous chapter walked through how a Netty server starts up. This chapter looks at how Netty's eventLoop works.

NioEventLoop maintains a single thread. When that thread starts, it calls NioEventLoop's run method, which executes both I/O tasks and non-I/O tasks.

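  • I/O tasks are the ready events on a selectionKey, such as accept, connect, read, and write. They are triggered by processSelectedKeysOptimized or processSelectedKeysPlain.
  • Non-I/O tasks are the tasks added to the taskQueue, such as register0 and bind0. They are triggered by runAllTasks.
  • The ratio of time spent on the two kinds of task is controlled by the variable ioRatio. The default is 50, which means non-I/O tasks are allowed as much execution time as I/O tasks.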
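To make the ioRatio arithmetic concrete, here is a minimal sketch of the budget calculation that run() performs with `ioTime * (100 - ioRatio) / ioRatio`. The class and method names (`IoRatioBudget`, `taskBudgetNanos`) are hypothetical and exist only for illustration; they are not part of Netty.

```java
/** Illustrative only: mirrors the arithmetic in NioEventLoop.run(), not a Netty API. */
final class IoRatioBudget {

    /**
     * Returns the time budget in nanoseconds for non-I/O tasks, given how long
     * the I/O phase just took and an ioRatio in the range 1..99.
     */
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            // ioRatio == 100 is handled separately in run(): tasks run without a time limit.
            throw new IllegalArgumentException("ioRatio must be in 1..99: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 2_000_000L; // assume the I/O phase took 2 ms
        System.out.println(taskBudgetNanos(ioTime, 50)); // 2000000: equal time for both kinds of task
        System.out.println(taskBudgetNanos(ioTime, 80)); // 500000: tasks get a quarter of the I/O time
        System.out.println(taskBudgetNanos(ioTime, 20)); // 8000000: tasks get four times the I/O time
    }
}
```

A higher ioRatio therefore shrinks the task budget relative to the measured I/O time, and a lower one enlarges it.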

Implementation of NioEventLoop.run

    protected void run() {
        for (;;) {
            boolean oldWakenUp = wakenUp.getAndSet(false);
            try {
                if (hasTasks()) {
                    selectNow();
                } else {
                    select(oldWakenUp);
                    if (wakenUp.get()) {
                        selector.wakeup();
                    }
                }

                cancelledKeys = 0;
                needsToSelectAgain = false;
                final int ioRatio = this.ioRatio;
                if (ioRatio == 100) {
                    processSelectedKeys();
                    runAllTasks();
                } else {
                    final long ioStartTime = System.nanoTime();
                    processSelectedKeys();
                    final long ioTime = System.nanoTime() - ioStartTime;
                    runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                }

                if (isShuttingDown()) {
                    closeAll();
                    if (confirmShutdown()) {
                        break;
                    }
                }
            } catch (Throwable t) {
                logger.warn("Unexpected exception in the selector loop.", t);

                // Prevent possible consecutive immediate failures that lead to
                // excessive CPU consumption.
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    // Ignore.
                }
            }
        }
    }

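hasTasks() checks whether the taskQueue currently holds any elements.
1. If the taskQueue has elements, selectNow() is executed, which ends up calling selector.selectNow(). That call returns immediately.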

    void selectNow() throws IOException {
        try {
            selector.selectNow();
        } finally {
            // restore wakup state if needed
            if (wakenUp.get()) {
                selector.wakeup();
            }
        }
    }
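The behavior that selectNow() relies on can be observed with the plain JDK Selector. The sketch below is not Netty code, and the class and method names are hypothetical. It shows that selectNow() does not block, and that a wakeup() issued beforehand makes the next select(timeout) return without waiting. The JDK documents that selectNow() clears the effect of earlier wakeup() calls, which is presumably why the finally block above re-issues wakeup() when wakenUp is set.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.TimeUnit;

/** Illustrative only: exercises java.nio.channels.Selector directly. */
final class SelectNowDemo {

    /** selectNow() never blocks; with no channels registered it reports 0 ready keys. */
    static int readyKeysFromSelectNow() {
        try (Selector selector = Selector.open()) {
            return selector.selectNow();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Measures how long select(timeout) blocks when wakeup() was called just before it. */
    static long millisBlockedAfterWakeup(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            selector.wakeup(); // the next selection operation will return immediately
            long start = System.nanoTime();
            selector.select(timeoutMillis);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ready keys: " + readyKeysFromSelectNow());
        System.out.println("blocked for about " + millisBlockedAfterWakeup(5_000)
                + " ms despite a 5 s timeout");
    }
}
```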

2. If the taskQueue is empty, select(oldWakenUp) is executed. The code is as follows:

    private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }

                int selectedKeys = selector.select(timeoutMillis);
                selectCnt++;

                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    // - Selected something,
                    // - waken up by user, or
                    // - the task queue has a pending task.
                    // - a scheduled task is ready for processing
                    break;
                }
                if (Thread.interrupted()) {
                    // Thread was interrupted so reset selected keys and break so we not run into a busy loop.
                    // As this is most likely a bug in the handler of the user or it's client library we will
                    // also log it.
                    //
                    // See https://github.com/netty/netty/issues/2426
                    if (logger.isDebugEnabled()) {
                        logger.debug("Selector.select() returned prematurely because " +
                                "Thread.currentThread().interrupt() was called. Use " +
                                "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
                    }
                    selectCnt = 1;
                    break;
                }

                long time = System.nanoTime();
                if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                    // timeoutMillis elapsed without anything selected.
                    selectCnt = 1;
                } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                    // The selector returned prematurely many times in a row.
                    // Rebuild the selector to work around the problem.
                    logger.warn(
                            "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                            selectCnt);

                    rebuildSelector();
                    selector = this.selector;

                    // Select again to populate selectedKeys.
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                currentTimeNanos = time;
            }

            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row.", selectCnt - 1);
                }
            }
        } catch (CancelledKeyException e) {
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector - JDK bug?", e);
            }
            // Harmless exception - log anyway
        }
    }

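This method works around a notorious NIO bug: the selector's select method can drive CPU usage to 100%.
1. delayNanos(currentTimeNanos): computes how long until the first task in the delayed-task queue is due, that is, the longest the loop can still wait before running it. When the queue is empty it returns SCHEDULE_PURGE_INTERVAL, which defaults to 1 s. Every SingleThreadEventExecutor holds a PriorityQueue of delayed tasks, and a task is added to that queue when the thread starts.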

    protected long delayNanos(long currentTimeNanos) {
        ScheduledFutureTask<?> delayedTask = delayedTaskQueue.peek();
        if (delayedTask == null) {
            return SCHEDULE_PURGE_INTERVAL;
        }

        return delayedTask.delayNanos(currentTimeNanos);
    }

    // ScheduledFutureTask
    public long delayNanos(long currentTimeNanos) {
        return Math.max(0, deadlineNanos() - (currentTimeNanos - START_TIME));
    }

    public long deadlineNanos() {
        return deadlineNanos;
    }
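The same idea can be sketched outside Netty with a priority queue of deadlines. The names below (`DelayQueueSketch`, `schedule`, `PURGE_INTERVAL_NANOS`) are hypothetical. The one-second fallback follows the default stated above, and deadlines are plain nanosecond values on a caller-supplied clock rather than Netty's START_TIME-relative timestamps.

```java
import java.util.PriorityQueue;
import java.util.concurrent.TimeUnit;

/** Illustrative only: a simplified stand-in for the delayed-task queue lookup. */
final class DelayQueueSketch {

    // Assumed fallback when nothing is scheduled (1 s, per the default described in the text).
    static final long PURGE_INTERVAL_NANOS = TimeUnit.SECONDS.toNanos(1);

    // Natural ordering of Long keeps the earliest deadline at the head.
    private final PriorityQueue<Long> deadlines = new PriorityQueue<>();

    void schedule(long deadlineNanos) {
        deadlines.add(deadlineNanos);
    }

    /** Time left until the earliest deadline, never negative; the fallback if the queue is empty. */
    long delayNanos(long nowNanos) {
        Long earliest = deadlines.peek();
        if (earliest == null) {
            return PURGE_INTERVAL_NANOS;
        }
        return Math.max(0L, earliest - nowNanos);
    }

    public static void main(String[] args) {
        DelayQueueSketch queue = new DelayQueueSketch();
        System.out.println(queue.delayNanos(0L));         // 1000000000: empty queue, fallback interval
        queue.schedule(5_000_000L);
        queue.schedule(2_000_000L);
        System.out.println(queue.delayNanos(0L));         // 2000000: the earliest deadline wins
        System.out.println(queue.delayNanos(3_000_000L)); // 0: that deadline has already passed
    }
}
```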

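2. If the first task in the delayed-task queue is due in less than 500000 ns (0.5 ms), timeoutMillis rounds to 0 and the loop exits. If selectCnt == 0 at that point, selector.selectNow() is executed first. (selectCnt records how many times selector.select has been called, and also marks whether selector.selectNow() has already been executed.)
3. Otherwise selector.select(timeoutMillis) is executed. That method was analyzed earlier in "NIO Socket in Depth".
4. If there is already a ready selectionKey, or the selector was woken up, or the taskQueue is not empty, or the scheduledTaskQueue is not empty, the loop exits.
5. If selectCnt has not reached the threshold SELECTOR_AUTO_REBUILD_THRESHOLD (default 512), the for loop continues. currentTimeNanos is reassigned to the current time after the select call. If selector.select(timeoutMillis) really did block for timeoutMillis, selectCnt is reset to 1 and the timeoutMillis computed on the second pass is at most 0, so the loop exits directly.
6. What happens if the epoll 100% CPU bug is triggered?
selector.select(timeoutMillis) returns immediately instead of blocking for timeoutMillis, so currentTimeNanos barely changes. In that case selector.select(timeoutMillis) is called over and over and selectCnt keeps growing. Once selectCnt reaches the threshold, rebuildSelector is called to rebuild the selector, which resolves the 100% CPU problem.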
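Steps 2 through 6 can be checked with a small deterministic simulation. This is a sketch under stated assumptions, not Netty's implementation: the clock is simulated, every simulated select call returns zero keys after a fixed elapsed time, and the wakeup, task-queue, and interrupt checks are left out. All names (`SpinDetectionSketch`, `Result`, `simulate`) are hypothetical. The threshold of 512 is the default mentioned in step 5.

```java
/** Illustrative only: simulates the premature-return counting in select(). */
final class SpinDetectionSketch {

    static final int REBUILD_THRESHOLD = 512; // default per the text above

    /** Outcome of one simulated pass through the select loop. */
    static final class Result {
        final int selectCalls;
        final boolean rebuilt;

        Result(int selectCalls, boolean rebuilt) {
            this.selectCalls = selectCalls;
            this.rebuilt = rebuilt;
        }
    }

    /** Remaining time in milliseconds, rounded to nearest by adding 0.5 ms first. */
    static long timeoutMillis(long deadlineNanos, long nowNanos) {
        return (deadlineNanos - nowNanos + 500_000L) / 1_000_000L;
    }

    /**
     * @param delayNanos            time until the next scheduled task is due
     * @param elapsedPerSelectNanos how long each simulated select call takes before returning 0 keys
     */
    static Result simulate(long delayNanos, long elapsedPerSelectNanos) {
        long now = 0L;
        long deadline = now + delayNanos;
        int selectCnt = 0;
        int calls = 0;
        for (;;) {
            long timeout = timeoutMillis(deadline, now);
            if (timeout <= 0) {
                return new Result(calls, false); // deadline reached: leave the loop normally
            }
            long time = now + elapsedPerSelectNanos; // simulated select(timeout) returning 0 keys
            calls++;
            selectCnt++;
            if (time - timeout * 1_000_000L >= now) {
                selectCnt = 1; // the call really blocked for the whole timeout
            } else if (selectCnt >= REBUILD_THRESHOLD) {
                return new Result(calls, true); // too many premature returns: rebuild the selector
            }
            now = time;
        }
    }

    public static void main(String[] args) {
        Result healthy = simulate(1_000_000_000L, 1_000_000_000L); // select blocks the full second
        Result spinning = simulate(1_000_000_000L, 1_000L);        // select returns after 1 microsecond
        System.out.println(healthy.selectCalls + " " + healthy.rebuilt);   // 1 false
        System.out.println(spinning.selectCalls + " " + spinning.rebuilt); // 512 true
    }
}
```

In the healthy case the second pass computes a timeout of 0 and exits after a single select call. In the spinning case the clock barely advances, so the counter climbs to the threshold and the rebuild path is taken.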
