線程池ThreadPool及Task調度機制分析

阿新 • • 發佈：2018-07-05

do it dequeue sts location object 資料 date async truct

近1年，偶爾發生應用系統啟動時某些操作超時的問題，特別在使用4核心Surface以後。筆記本和臺式機比較少遇到，服務器則基本上沒有遇到過。

這些年，我寫的應用都有一個習慣，就是啟動時異步做很多準備工作。基本上確定這個問題跟它們有關。

最近兩個月花了些時間分析線程池調度機制，有點繞，這裏記錄下來，防止以後忘了。

一、現象

這裏以一個典型WinForm應用來分析。開發環境Surface Pro4，CPU=4

在vs中調試應用，可以明顯感覺到啟動時會卡3~5秒，卡住時點下暫停。

通過調用棧發現程序死鎖了，調用邏輯偽代碼：Click=>6 * Task.Run(GetBill)=>Init=>GetConfig

業務邏輯，點擊按鈕Click，異步調用6次GetBill，每次都要Init判斷初始化，這裏有lock，拿到鎖的第一個線程GetConfig從配置中心拿配置數據。

線程窗口，5個線程卡在 Init 的lock那裏，1個線程通過Init進入GetConfig。

GetConfig內部通過HttpClient異步請求數據，用了 task.Wait(5000)，這裏也卡住了。

技術分享圖片

就這樣，6個線程死在這，一動不動的。

通過網絡抓包發現，Http的請求早就返回來了，根本不需要等5000ms。

查看任務窗口，大量“已阻止”和“已計劃”，兩個“等待”，然後大家都不動，這就是死鎖了。

技術分享圖片

從任務調度層面來猜測，應該是Task調度隊列擁擠，導致HttpClient異步請求完成以後，沒有辦法安排線程去同時task.Wait(5000)退出。

Task調度一直覺得很復雜，不好深入分析。

二、線程池

剛開始以為是大量使用Task.Run所致，大部分改為ThreadPool.QueueUserWorkItem以後，堵塞有所減少，但還是存在。

ILSpy打開ThreadPool發現，它也變得復雜了，不再是.Net2.0時代那個單純的小夥子。

時間優先，上個月寫了個線程池ThreadPoolX，並行隊列管理線程，每次排隊任務委托，就拿一個出來用，用完後還回去。

源碼如下：https://github.com/NewLifeX/X/blob/master/NewLife.Core/Threading/ThreadPoolX.cs

更新到上面這個WinForm應用，死鎖問題立馬解決。

ThreadPoolX非常簡單，所有異步任務都有平等獲取線程的機會，不存在說前面的線程卡住了，後面線程就沒有機會執行。

盡管利用率低一些，但是可以輕易避免這種死鎖的發生。

因此，可以確定是因為Task調度和ThreadPoll調度裏面的某種智能化機制，加上程序裏可能不合理的使用，導致了死鎖的發生！

三、深入分析

上個月雖然解決了問題，但沒有搞清楚內部機制，總是睡不好。最近晚上有時間查了各種資料，以及分析了源碼。

Task/TPL默認都是調用ThreadPool來執行任務，我們就以最常用的Task.Run作為切入點來分析。

/// <summary>
/// Queues the specified work to run on the ThreadPool and returns a Task handle for that work.
/// </summary>
/// <param name="action">The work to execute asynchronously</param>
/// <returns>A Task that represents the work queued to execute in the ThreadPool.</returns>
/// <exception cref="T:System.ArgumentNullException">
/// The <paramref name="action"/> parameter was null.
/// </exception>
public static Task Run(Action action)
{
    return Task.InternalStartNew(null, action, null, default(CancellationToken), TaskScheduler.Default,
        TaskCreationOptions.DenyChildAttach, InternalTaskOptions.None);
}

Task.Run內部使用了默認調度器，另一個關註點就是 DenyChildAttach了，阻止其它任務作為當前任務的子任務。

// Implicitly converts action to object and handles the meat of the StartNew() logic.
internal static Task InternalStartNew(
    Task creatingTask, Delegate action, object state, CancellationToken cancellationToken, TaskScheduler scheduler,
    TaskCreationOptions options, InternalTaskOptions internalOptions)
{
    // Validate arguments.
    if (scheduler == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.scheduler);
    }
    Contract.EndContractBlock();

    // Create and schedule the task. This throws an InvalidOperationException if already shut down.
    // Here we add the InternalTaskOptions.QueuedByRuntime to the internalOptions, so that TaskConstructorCore can skip the cancellation token registration
    Task t = new Task(action, state, creatingTask, cancellationToken, options, internalOptions | InternalTaskOptions.QueuedByRuntime, scheduler);

    t.ScheduleAndStart(false);
    return t;
}

InternalStartNew內部實例化一個Task對象，然後調用ScheduleAndStart，加入調度器並且啟動

/// <summary>
/// Schedules the task for execution.
/// </summary>
/// <param name="needsProtection">If true, TASK_STATE_STARTED bit is turned on in
/// an atomic fashion, making sure that TASK_STATE_CANCELED does not get set
/// underneath us.  If false, TASK_STATE_STARTED bit is OR-ed right in.  This
/// allows us to streamline things a bit for StartNew(), where competing cancellations
/// are not a problem.</param>
internal void ScheduleAndStart(bool needsProtection)
{
    Debug.Assert(m_taskScheduler != null, "expected a task scheduler to have been selected");
    Debug.Assert((m_stateFlags & TASK_STATE_STARTED) == 0, "task has already started");

    // Set the TASK_STATE_STARTED bit
    if (needsProtection)
    {
        if (!MarkStarted())
        {
            // A cancel has snuck in before we could get started.  Quietly exit.
            return;
        }
    }
    else
    {
        m_stateFlags |= TASK_STATE_STARTED;
    }

    if (s_asyncDebuggingEnabled)
    {
        AddToActiveTasks(this);
    }

    if (AsyncCausalityTracer.LoggingOn && (Options & (TaskCreationOptions)InternalTaskOptions.ContinuationTask) == 0)
    {
        //For all other task than TaskContinuations we want to log. TaskContinuations log in their constructor
        AsyncCausalityTracer.TraceOperationCreation(CausalityTraceLevel.Required, this.Id, "Task: " + m_action.Method.Name, 0);
    }


    try
    {
        // Queue to the indicated scheduler.
        m_taskScheduler.InternalQueueTask(this);
    }
    catch (ThreadAbortException tae)
    {
        AddException(tae);
        FinishThreadAbortedTask(delegateRan: false);
    }
    catch (Exception e)
    {
        // The scheduler had a problem queueing this task.  Record the exception, leaving this task in
        // a Faulted state.
        TaskSchedulerException tse = new TaskSchedulerException(e);
        AddException(tse);
        Finish(false);

        // Now we need to mark ourselves as "handled" to avoid crashing the finalizer thread if we are called from StartNew(),
        // because the exception is either propagated outside directly, or added to an enclosing parent. However we won‘t do this for
        // continuation tasks, because in that case we internally eat the exception and therefore we need to make sure the user does
        // later observe it explicitly or see it on the finalizer.

        if ((Options & (TaskCreationOptions)InternalTaskOptions.ContinuationTask) == 0)
        {
            // m_contingentProperties.m_exceptionsHolder *should* already exist after AddException()
            Debug.Assert(
                (m_contingentProperties != null) &&
                (m_contingentProperties.m_exceptionsHolder != null) &&
                (m_contingentProperties.m_exceptionsHolder.ContainsFaultList),
                    "Task.ScheduleAndStart(): Expected m_contingentProperties.m_exceptionsHolder to exist " +
                    "and to have faults recorded.");

            m_contingentProperties.m_exceptionsHolder.MarkAsHandled(false);
        }
        // re-throw the exception wrapped as a TaskSchedulerException.
        throw tse;
    }
}

準備了很多工作，最終還是為了加入調度器m_taskScheduler.InternalQueueTask(this)

protected internal override void QueueTask(Task task)
{
    if ((task.Options & TaskCreationOptions.LongRunning) != 0)
    {
        Thread thread = new Thread(s_longRunningThreadWork);
        thread.IsBackground = true;
        thread.Start(task);
    }
    else
    {
        bool forceGlobal = (task.Options & TaskCreationOptions.PreferFairness) != TaskCreationOptions.None;
        ThreadPool.UnsafeQueueCustomWorkItem(task, forceGlobal);
    }
}

線程池任務調度器ThreadPoolTaskScheduler的QueueTask是重點。

首先是LongRunning標識，直接開了個新線程，很粗暴很直接。

其次是PreferFairness標識，公平，forceGlobal，這個應該就是導致死鎖的根本。

public void Enqueue(IThreadPoolWorkItem callback, bool forceGlobal)
{
    if (loggingEnabled)
        System.Diagnostics.Tracing.FrameworkEventSource.Log.ThreadPoolEnqueueWorkObject(callback);

    ThreadPoolWorkQueueThreadLocals tl = null;
    if (!forceGlobal)
        tl = ThreadPoolWorkQueueThreadLocals.threadLocals;

    if (null != tl)
    {
        tl.workStealingQueue.LocalPush(callback);
    }
    else
    {
        workItems.Enqueue(callback);
    }

    EnsureThreadRequested();
}

未打開全局且有本地隊列時，放入本地隊列threadLocals，否則加入全局隊列workItems。

正式化這個本地隊列的優化機制，導致了我們的死鎖。

如果應用層直接調用 ThreadPool.QueueUserWorkItem ，都是 forceGlobal=true，也就都是全局隊列。

這也說明了為什麽我們把部分Task.Run改為ThreadPool.QueueUserItem後，情況有所改觀。

internal void EnsureThreadRequested()
{
    //
    // If we have not yet requested #procs threads from the VM, then request a new thread.
    // Note that there is a separate count in the VM which will also be incremented in this case, 
    // which is handled by RequestWorkerThread.
    //
    int count = numOutstandingThreadRequests;
    while (count < ThreadPoolGlobals.processorCount)
    {
        int prev = Interlocked.CompareExchange(ref numOutstandingThreadRequests, count + 1, count);
        if (prev == count)
        {
            ThreadPool.RequestWorkerThread();
            break;
        }
        count = prev;
    }
}

上面把任務放入隊列後，通過QCall調用了EnsureThreadRequested，此時豁然開朗！

原來，這裏才是真正的申請線程池來處理隊列裏面的任務，並且最大線程數就是處理器個數！

我們可以寫個簡單程序來驗證一下：

Console.WriteLine("CPU={0}", Environment.ProcessorCount);
for (var i = 0; i < 10; i++)
{
    ThreadPool.QueueUserWorkItem(s =>
    {
        var n = (Int32)s;
        Console.WriteLine("{0:HH:mm:ss.fff} th {1} start", DateTime.Now, n);
        Thread.Sleep(2000);
        Console.WriteLine("{0:HH:mm:ss.fff} th {1} end", DateTime.Now, n);
    }, i);
}

CPU=4
18:05:27.936 th 2 start
18:05:27.936 th 3 start
18:05:27.936 th 1 start
18:05:27.936 th 0 start
18:05:29.373 th 4 start
18:05:29.939 th 2 end
18:05:29.940 th 5 start
18:05:29.940 th 0 end
18:05:29.941 th 6 start
18:05:29.940 th 1 end
18:05:29.940 th 3 end
18:05:29.942 th 7 start
18:05:29.942 th 8 start
18:05:30.871 th 9 start
18:05:31.374 th 4 end
18:05:31.942 th 5 end
18:05:31.942 th 6 end
18:05:31.943 th 7 end
18:05:31.943 th 8 end
18:05:32.872 th 9 end

在我的4核心CPU上執行，27.936先調度了4個任務，然後1秒多之後再調度第5個任務，其它任務則是等前面4個任務完成以後才有機會。

第5個任務能夠在前4個完成之前得到調度，可能跟Sleep有關，這是內部機制了。

目前可以肯定的是，ThreadPool空有1000個最大線程數，但實際上只能用略大於CPU個數的線程！（CPU+1 ？）

當然，它內部應該有其它機制來增加線程調度，比如Sleep。

最後是調度Dispatch

internal static bool Dispatch()
{
    var workQueue = ThreadPoolGlobals.workQueue;
    //
    // The clock is ticking!  We have ThreadPoolGlobals.TP_QUANTUM milliseconds to get some work done, and then
    // we need to return to the VM.
    //
    int quantumStartTime = Environment.TickCount;

    //
    // Update our records to indicate that an outstanding request for a thread has now been fulfilled.
    // From this point on, we are responsible for requesting another thread if we stop working for any
    // reason, and we believe there might still be work in the queue.
    //
    // Note that if this thread is aborted before we get a chance to request another one, the VM will
    // record a thread request on our behalf.  So we don‘t need to worry about getting aborted right here.
    //
    workQueue.MarkThreadRequestSatisfied();

    // Has the desire for logging changed since the last time we entered?
    workQueue.loggingEnabled = FrameworkEventSource.Log.IsEnabled(EventLevel.Verbose, FrameworkEventSource.Keywords.ThreadPool | FrameworkEventSource.Keywords.ThreadTransfer);

    //
    // Assume that we‘re going to need another thread if this one returns to the VM.  We‘ll set this to 
    // false later, but only if we‘re absolutely certain that the queue is empty.
    //
    bool needAnotherThread = true;
    IThreadPoolWorkItem workItem = null;
    try
    {
        //
        // Set up our thread-local data
        //
        ThreadPoolWorkQueueThreadLocals tl = workQueue.EnsureCurrentThreadHasQueue();

        //
        // Loop until our quantum expires.
        //
        while ((Environment.TickCount - quantumStartTime) < ThreadPoolGlobals.TP_QUANTUM)
        {
            bool missedSteal = false;
            workItem = workQueue.Dequeue(tl, ref missedSteal);

            if (workItem == null)
            {
                //
                // No work.  We‘re going to return to the VM once we leave this protected region.
                // If we missed a steal, though, there may be more work in the queue.
                // Instead of looping around and trying again, we‘ll just request another thread.  This way
                // we won‘t starve other AppDomains while we spin trying to get locks, and hopefully the thread
                // that owns the contended work-stealing queue will pick up its own workitems in the meantime, 
                // which will be more efficient than this thread doing it anyway.
                //
                needAnotherThread = missedSteal;

                // Tell the VM we‘re returning normally, not because Hill Climbing asked us to return.
                return true;
            }

            if (workQueue.loggingEnabled)
                System.Diagnostics.Tracing.FrameworkEventSource.Log.ThreadPoolDequeueWorkObject(workItem);

            //
            // If we found work, there may be more work.  Ask for another thread so that the other work can be processed
            // in parallel.  Note that this will only ask for a max of #procs threads, so it‘s safe to call it for every dequeue.
            //
            workQueue.EnsureThreadRequested();

            //
            // Execute the workitem outside of any finally blocks, so that it can be aborted if needed.
            //
            if (ThreadPoolGlobals.enableWorkerTracking)
            {
                bool reportedStatus = false;
                try
                {
                    ThreadPool.ReportThreadStatus(isWorking: true);
                    reportedStatus = true;
                    workItem.ExecuteWorkItem();
                }
                finally
                {
                    if (reportedStatus)
                        ThreadPool.ReportThreadStatus(isWorking: false);
                }
            }
            else
            {
                workItem.ExecuteWorkItem();
            }
            workItem = null;

            // 
            // Notify the VM that we executed this workitem.  This is also our opportunity to ask whether Hill Climbing wants
            // us to return the thread to the pool or not.
            //
            if (!ThreadPool.NotifyWorkItemComplete())
                return false;
        }

        // If we get here, it‘s because our quantum expired.  Tell the VM we‘re returning normally.
        return true;
    }
    catch (ThreadAbortException tae)
    {
        //
        // This is here to catch the case where this thread is aborted between the time we exit the finally block in the dispatch
        // loop, and the time we execute the work item.  QueueUserWorkItemCallback uses this to update its accounting of whether
        // it was executed or not (in debug builds only).  Task uses this to communicate the ThreadAbortException to anyone
        // who waits for the task to complete.
        //
        workItem?.MarkAborted(tae);

        //
        // In this case, the VM is going to request another thread on our behalf.  No need to do it twice.
        //
        needAnotherThread = false;
        // throw;  //no need to explicitly rethrow a ThreadAbortException, and doing so causes allocations on amd64.
    }
    finally
    {
        //
        // If we are exiting for any reason other than that the queue is definitely empty, ask for another
        // thread to pick up where we left off.
        //
        if (needAnotherThread)
            workQueue.EnsureThreadRequested();
    }

    // we can never reach this point, but the C# compiler doesn‘t know that, because it doesn‘t know the ThreadAbortException will be reraised above.
    Debug.Fail("Should never reach this point");
    return true;
}

public IThreadPoolWorkItem Dequeue(ThreadPoolWorkQueueThreadLocals tl, ref bool missedSteal)
{
    WorkStealingQueue localWsq = tl.workStealingQueue;
    IThreadPoolWorkItem callback;

    if ((callback = localWsq.LocalPop()) == null && // first try the local queue
        !workItems.TryDequeue(out callback)) // then try the global queue
    {
        // finally try to steal from another thread‘s local queue
        WorkStealingQueue[] queues = WorkStealingQueueList.Queues;
        int c = queues.Length;
        Debug.Assert(c > 0, "There must at least be a queue for this thread.");
        int maxIndex = c - 1;
        int i = tl.random.Next(c);
        while (c > 0)
        {
            i = (i < maxIndex) ? i + 1 : 0;
            WorkStealingQueue otherQueue = queues[i];
            if (otherQueue != localWsq && otherQueue.CanSteal)
            {
                callback = otherQueue.TrySteal(ref missedSteal);
                if (callback != null)
                {
                    break;
                }
            }
            c--;
        }
    }

    return callback;
}

這個Dispatch應該是由內部借出來的線程池線程調用，有點意思：

一次Dispatch處理多個任務，只要總耗時不超過30個滴答，這樣可以減少線程切換
每次從隊列拿一個任務來處理，然後檢查打開更多線程（如果不足CPU數）
先從本地隊列彈出任務，然後到全局隊列，最後再從其它線程的本地隊列隨機偷一個
本地隊列是壓棧彈棧FILO，也就是先進來的任務後執行

這裏最復雜的就是本地隊列FILO結構，這也是專門為Task而設計。

End.

線程池ThreadPool及Task調度機制分析

do it dequeue sts location object 資料 date async truct 近1年，偶爾發生應用系統啟動時某些操作超時的問題，特別在使用4核心Surface以後。筆記本和臺式機比較少遇到，服務器則基本上沒有遇到過。這些年，我寫的應用都有一

線程池原理及實現

任務隊列批量 not alt con 成了代碼 pla extends 1、線程池簡介：多線程技術主要解決處理器單元內多個線程執行的問題，它可以顯著減少處理器單元的閑置時間，增加處理器單元的吞吐能力。假設一個服務器完成一項任務所需時間為：T1

C# 線程池ThreadPool的用法簡析

可見 https sdn 而是 plain call 計時器最大線程數 water https://blog.csdn.net/smooth_tailor/article/details/52460566 什麽是線程池？為什麽要用線程池？怎麽用線程池？ 1. 什

完全解析線程池ThreadPool原理&使用

ati maximum adf 斷線 sched 源碼調用 executor 線程池目錄 1. 簡介 2. 工作原理 2.1 核心參數線程池中有6個核心參數，具體如下上述6個參數的配置決定了線程池的功能，具體設置時機 = 創建線程池類對

線程池原理及python實現

source 實例以及代碼 let range python實現 queue 上界 https://www.cnblogs.com/goodhacker/p/3359985.html 為什麽需要線程池　　目前的大多數網絡服務器，包括Web服務器、Email服務器以

（四）juc線程高級特性——線程池 / 線程調度 / ForkJoinPool

() calculate 選擇程序 color 區別 sts ash 工廠 13. 線程池第四種獲取線程的方法：線程池，一個 ExecutorService，它使用可能的幾個池線程之一執行每個提交的任務，通常使用 Executors 工廠方法配置。線程池可以解決

Java並發編程（7）- 線程調度 - 線程池

car 並且執行創建線程 atomic 介紹狀態 edr 關閉 imp 線程池平時有接觸過多線程開發的小夥伴們應該都或多或少都有了解、使用過線程池，而《阿裏巴巴 Java 手冊》裏也有一條規範：由此可見線程池的重要性，線程池對於限制應用程序中同一時刻運行的線程數很有

線程池--spring配置，靜態上下文獲取以及調用

spring@ImportResource({"classpath:dubbo.xml","classpath*:applicationContext.xml"})定義applicationContext.xml<?xml version="1.0" encoding="UTF-8"?><b

線程池的原理及實現

execute inter void date() 超過緩沖線程池大小 exceptio 調整 1、線程池簡介：多線程技術主要解決處理器單元內多個線程執行的問題，它可以顯著減少處理器單元的閑置時間，增加處理器單元的吞吐能力。假設一個服務器完成一項任務所需時間為：T1

線程池的工作原理及使用示例

影響 cti run padding 返回值 read mina 容器原理 . 為什麽要使用線程池？我們現在考慮最簡單的服務器工作模型：服務器每當接收到一個客戶端請求時就創建一個線程為其服務。這種模式理論上可以工作的很好，但實際上會存在一些缺陷，服務器應用程序中經常出

線程池的使用及ThreadPoolExecutor的execute和addWorker源碼分析

單位 bool 集合 handler 意思 size 執行順序 targe execute 說明：本作者是文章的原創作者，轉載請註明出處：本文地址：http://www.cnblogs.com/qm-article/p/7821602.html 一、線程池的介紹

多線程狀態及線程池管理

exe i/o wait 存儲 auto alt 周期 queue imu 一. 線程狀態類型 1. 新建狀態（New）：新創建了一個線程對象。 2. 就緒狀態（Runnable）：線程對象創建後，其他線程調用了該對象的start()方法。該狀態的線程位於可運行線程池中，變

C#多線程編程（1）--線程，線程池和Task

gpo 第一次 span via 任務隊列返回值異步如果是你　　新開了一個多線程編程系列，該系列主要講解C#中的多線程編程。　利用多線程的目的有2個: 一是防止UI線程被耗時的程序占用，導致界面卡頓；二是能夠利用多核CPU的資源，提高運行效率。　　我沒有

並發編程 - 線程 - 1.線程queue/2.線程池進程池/3.異步調用與回調機制

cal 編程機制 com size ssp .org don 結果 1.線程queue :會有鎖 q=queue.Queue(3) q.get() q.put()先進先出隊列後進先出堆棧優先級隊列 1 """先進先出隊列""" 2 impor

並發編程---線程queue---進程池線程池---異部調用(回調機制)

priority close name port rand current 結果耦合 join 線程隊列：先進先出堆棧：後進先出優先級：數字越小優先級越大，越先輸出 import queue q = queue.Queue(3) # 先進先出-->隊

第一次作業：Linux的進程模型及CFS調度器算法分析

並行執行 pick 資源 virt 時鐘程序代碼 processor 關系 mov 1. 關於進程 1.1. 進程的定義進程：指在系統中能獨立運行並作為資源分配的基本單位，它是由一組機器指令、數據和堆棧等組成的，是一個能獨立運行的活動實體。進程是程序的一次執行

淺談線程池（中）：獨立線程池的作用及IO線程池

關於線程數客戶端 pool 網絡程序服務器缺點 public 在上一篇文章中，我們簡單討論了線程池的作用，以及CLR線程池的一些特性。不過關於線程池的基本概念還沒有結束，這次我們再來補充一些必要的信息，有助於我們在程序中選擇合適的使用方式。獨立線程池上次我們討

淺談線程池（下）：相關試驗及註意事項

DG 執行 html ble DC resp lin 不足 4.0 三個月，整整三個月了，我忽然發現我還有三個月前的一個小系列的文章沒有結束，我還欠一個試驗！線程池是.NET中的重要組件，幾乎所有的異步功能依賴於線程池。之前我們討論了線程池的作用、獨立線程池的存在意義，以及

線程池與並行度

資源 start 創建 sta tel span nds sys 不同的本節將展示線程池如何工作於大量的異步操作，以及它與創建大量單獨的線程的方式有和不同。代碼Demo： using System;using System.Threading;using System.

線程池（ThreadPool）

方案分配 ati 通過計算工作原理結果中大 ntb 線程池（ThreadPool） https://www.cnblogs.com/jonins/p/9369927.html 線程池概述由系統維護的容納線程的容器，由CLR控制的所有AppDomain共享。線程池

線程池ThreadPool及Task調度機制分析

一、現象

二、線程池

三、深入分析

相關推薦