
系統技術非業餘研究 » Bottlenecks in Network-Intensive Erlang Servers and Ideas for Solving Them

Recently we needed to do fine-grained performance tuning of our IO-intensive Erlang server, raising throughput from 400,000 packets per second toward a target of 600,000. This requires a solid understanding of how processes and the IO schedulers work, plus careful tuning of their behavior, so we spent quite some time reading the relevant documentation and source code.

The two most valuable papers were:
1. Characterizing the Scalability of Erlang VM on Many-core Processors (see here)
2. Evaluate the benefits of SMP support for IO-intensive Erlang applications (see here)

According to the hints from lcnt, our current performance bottlenecks are:

1. Lock contention on the scheduler run queues; see the figure below:

(figure: lcnt_rq_conflict — lcnt output showing run-queue lock contention)

2. Erlang has only a single poll set, so heavy IO becomes a bottleneck. Quoting the conclusion from p. 46 of "Evaluate the benefits of SMP support for IO-intensive Erlang applications":

Finally, we analyzed how IO operations are handled by the Erlang VM, and that
was the bottleneck. The problem lies in the fact that there is only one global
poll-set where IO tasks are kept. Hence, only one scheduler at a time can call
erts_check_io (the function responsible for performing IO tasks) to obtain pending
tasks from the poll-set. So, a scheduler can finish its job, but it has to wait
idly for the other schedulers to complete their pending IO tasks before it starts
its own ones. In more detail, for N schedulers, only one can call erts_check_io
regardless of the load; the other N-1 schedulers will start spinning until they gain
access to erts_check_io() and finish executing their IO tasks. For a bigger number of
schedulers, more schedulers will spin, and more CPU time will be wasted on spinning.
This behavior was noticed even during our evaluations when running the tests on an
8-core machine, apart from the 16-core one. There are two conditions that determine
whether a scheduler can access erts_check_io: one is for a mutex variable named
“doing_sys_schedule” to be unlocked, and the other one is to make sure the variable
“erts_port_task_outstanding_io_tasks” reaches a value of 0, meaning that there is
no IO task to process in the whole system. If one of the conditions breaks and there
is no other task to process, the scheduler starts spinning. The Emysql driver generates
a lot of processes to be executed by the schedulers since a new process is spawned
for each single insert or read request. By using the Erlang etop tool, we can check the
lifetime of all the processes created in the Erlang VM. Their lifetime is extremely
short, and they do nothing else (CPU-related) apart from the IO requests. The
requests are serialized at this point because whenever a scheduler starts doing the
system scheduling, it locks the mutex variable “doing_sys_schedule”, and then calls
the erts_check_io() function to find pending IO tasks. These tasks are spread to the
other schedulers and are processed only when the two aforementioned conditions are
fulfilled.
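The run-queue contention in bottleneck 1 can be measured with the lcnt tool. A rough sketch of the workflow (this assumes an emulator built with lock counting enabled, e.g. configured with --enable-lock-counter; the exact printout varies by release):

```erlang
%% In the Erlang shell of a lock-counting emulator:
1> lcnt:start().       % start the lcnt server
2> lcnt:clear().       % zero the counters, then run the workload for a while
3> lcnt:collect().     % snapshot the counters from the emulator
4> lcnt:conflicts().   % list the most contended locks; run_queue entries
                       % are the scheduler run-queue locks discussed here
```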

Before this, we had submitted a related patch to the erlang/otp team, but it was not accepted for merging; see the discussion here. The multiple poll set feature the OTP team promised has not been implemented for years now.

The scheduler's migration logic is covered in the first paper; a brief description can be found here:

The migration logic does the following:
- collect statistics about the maximum length of every scheduler's run queue
- set up migration paths
- take jobs away from fully loaded schedulers and push jobs onto the queues of lightly loaded schedulers
- act on whether the system runs at full load or not: if not all schedulers are fully loaded, jobs are migrated to schedulers with lower IDs, making some schedulers inactive
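The run-queue imbalance that drives this migration logic can be observed from the shell. A minimal sketch (the helper name is ours; erlang:statistics(run_queue) returns the total queue length summed over all run queues, which gives a rough per-scheduler load indicator):

```erlang
%% Sketch: spot the load imbalance that triggers migration.
check_load() ->
    Total  = erlang:statistics(run_queue),          % jobs queued, all schedulers
    Online = erlang:system_info(schedulers_online), % schedulers currently online
    {total_queued, Total, avg_per_scheduler, Total / Online}.
```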

Given this situation, we had to solve the problem ourselves. Erlang provides statistics that help us pinpoint the issue; here is a demonstration:

$ erl
Erlang R17A (erts-5.11) [source-e917f6d] [64-bit] [smp:16:16] [async-threads:10] [hipe] [kernel-poll:false] [type-assertions] [debug-compiled] [lock-checking] [systemtap]

Eshell V5.11  (abort with ^G)
1> erlang:system_flag(scheduling_statistics,enable).  
true
2> erlang:system_info(scheduling_statistics).         
[{process_max,0,0},
 {process_high,0,0},
 {process_normal,6234777,0},
 {process_low,0,0},
 {port,6234700,0}]
3> erlang:system_info(total_scheduling_statistics).
[{process_max,0,0},
 {process_high,0,0},
 {process_normal,7063305,0},
 {process_low,0,0},
 {port,7062620,0}]

Note the port entry above: it counts how many port operations were scheduled on the run queues; ports, too, are balanced back and forth between schedulers.
Matching this against the erl_process.c source, we can see what the numbers mean: the first number is the count of jobs executed by the schedulers, the second is the count of jobs migrated between run queues.

        for (i = 0; i < ERTS_NO_PRIO_LEVELS; i++) {
            prio[i] = erts_sched_stat.prio[i].name;
            executed[i] = erts_sched_stat.prio[i].total_executed;
            migrated[i] = erts_sched_stat.prio[i].total_migrated;
        }
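Putting the two counters together, a small helper (hypothetical, the name is ours) can report what fraction of scheduled jobs were migrated between run queues; it assumes erlang:system_flag(scheduling_statistics, enable) has already been called, as in the shell session above:

```erlang
%% Sketch: per-priority migration ratio from the scheduling statistics.
%% Each tuple is {PriorityClass, Executed, Migrated}; a high ratio of
%% Migrated/Executed suggests heavy cross-run-queue traffic.
migration_ratio() ->
    [{Prio, Migrated, Executed, Migrated / Executed}
     || {Prio, Executed, Migrated}
            <- erlang:system_info(total_scheduling_statistics),
        Executed > 0].
```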

Migrating a process in particular causes a lot of lock operations on both the source and destination run queues. If we observe that too many processes are being migrated, we should consider the undocumented process-binding option {scheduler, N} at spawn time to pin a process to a scheduler and avoid migration. For the concrete steps, see this post.
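As a sketch of that pinning, using the undocumented {scheduler, N} spawn option just mentioned (being undocumented, its availability and semantics can change between OTP releases, so treat this as illustrative only):

```erlang
%% Pin a worker to scheduler 1 via the undocumented {scheduler, N}
%% spawn option.  Undocumented options may change or disappear in
%% any OTP release.
Worker = spawn_opt(fun() ->
                       receive stop -> ok end   % placeholder worker body
                   end,
                   [{scheduler, 1}]).
%% The process stays on scheduler 1 instead of being migrated around.
```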

Have fun!
