系統技術非業餘研究 » Investigating a process's cause of death: killed?

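Lately our MySQL platform system has been updated exclusively through hot code upgrades, and in the production logs we found crash log entries like this one: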

2013-07-24 23:54:06 =ERROR REPORT====
** Generic server <0.31760.980> terminating
** Last message in was {'EXIT',<0.29814.980>,killed}
** When Server state == {state,"app873",false,172683,33,<>,<0.29814.980>,ump_proxy_session,1,[59,32,204,78,86,208,242,122,240,207,269,79,80],[],2,true,<<>>,0,0,{conn_info,{10,246,161,112},10145,"app873","8813684fc05fb6cd"},<0.31762.980>,1,{conn_info,{172,18,134,8},10085,"app873","8813684fc05fb6cd"},ump_proxy_cherly_server,<0.31763.980>,1,undefined,<<>>,false,true,[],{dict,2,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[[{session,names},103,98,107]],[],[],[],[],[],[],[[{session,"character_set_results"},78,85,76,76]],[],[],[],[],[],[],[]}}},1,0,0,0,0,0,200,1374681206757732}
** Reason for termination ==
** killed

The Reason here is killed, which was a bit puzzling.

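We know that a hot upgrade purges processes still running old code, and that the purge calls exit(P, kill) where necessary to terminate such a process. But how did kill turn into killed?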

Let's look at the source of release_handler and code to verify that reasoning:

%%release_handler_1.erl
eval({purge, Modules}, EvalState) ->
    % Now, if there are any processes still executing old code, OR
    % if some new processes started after suspend but before load,
    % these are killed.
    lists:foreach(fun(Mod) -> code:purge(Mod) end, Modules),
    EvalState;

%%code_server.erl
do_purge([P|Ps], Mod, Purged) ->
    case erlang:check_process_code(P, Mod) of
        true ->
            Ref = erlang:monitor(process, P),
            exit(P, kill),
            receive
                {'DOWN',Ref,process,_Pid,_} -> ok
            end,
            do_purge(Ps, Mod, true);
        false ->
            do_purge(Ps, Mod, Purged)
    end;
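
To see what that discarded fifth element of the 'DOWN' message actually holds, here is a minimal sketch that reuses the same monitor-then-kill pattern but returns the reason instead of throwing it away. The module name purge_sketch and the function kill_and_wait/1 are made up for this illustration; they are not part of OTP.

```erlang
-module(purge_sketch).
-export([kill_and_wait/1]).

%% Hypothetical helper: same monitor-then-kill sequence as do_purge/3,
%% except that the reason carried by the 'DOWN' message is returned.
kill_and_wait(Pid) ->
    Ref = erlang:monitor(process, Pid),
    exit(Pid, kill),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    end.
```

Calling purge_sketch:kill_and_wait(spawn(fun() -> receive after infinity -> ok end end)) on a live process should hand back the atom killed, not kill.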

So release_handler does end up calling exit(P, kill) to do the killing. But why does the other end see killed as the cause of death?

Digging a little deeper: exit is implemented inside the virtual machine, and a simple grep for killed turns up send_exit_signal, the function that does the work:

/*erl_process.c:L8279*/
static ERTS_INLINE int
send_exit_signal(Process *c_p,          /* current process if and only
                                           if reason is stored on it */
                 Eterm from,            /* Id of sender of signal */
                 Process *rp,           /* receiving process */
                 ErtsProcLocks *rp_locks,/* current locks on receiver */
                 Eterm reason,          /* exit reason */
                 Eterm exit_tuple,      /* Prebuild exit tuple
                                           or THE_NON_VALUE */
                 Uint exit_tuple_sz,    /* Size of prebuilt exit tuple
                                           (if exit_tuple != THE_NON_VALUE) */
                 Eterm token,           /* token */
                 Process *token_update, /* token updater */
                 Uint32 flags           /* flags */
    )
{
    Eterm rsn = reason == am_kill ? am_killed : reason;
...
}

Now it is clear: when the exit reason is kill, the runtime helpfully rewrites it to killed.
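
The crash log at the top can be reproduced in miniature. The sketch below is my own illustration, not code from our system: a process that traps exits links to a victim, the victim is killed with exit(Pid, kill), and the trapping side receives the same {'EXIT', Pid, killed} message that the gen_server logged. The module name kill_demo and the function linked_reason/0 are hypothetical.

```erlang
-module(kill_demo).
-export([linked_reason/0]).

%% Trap exits, link to a victim, kill it, and return the reason that
%% arrives in the resulting 'EXIT' message.
linked_reason() ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    exit(Pid, kill),
    Result = receive
                 {'EXIT', Pid, Reason} -> Reason
             after 1000 ->
                 timeout
             end,
    %% Restore the caller's original trap_exit setting.
    process_flag(trap_exit, Old),
    Result.
```

kill_demo:linked_reason() should return killed, which matches the "Last message in" line of the log.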

Back to the documentation:

exit(Pid, Reason) -> true

Types:

Pid = pid() | port()
Reason = term()
Sends an exit signal with exit reason Reason to the process or port identified by Pid.

The following behavior apply if Reason is any term except normal or kill:

If Pid is not trapping exits, Pid itself will exit with exit reason Reason. If Pid is trapping exits, the exit signal is transformed into a message {'EXIT', From, Reason} and delivered to the message queue of Pid. From is the pid of the process which sent the exit signal. See also process_flag/2.

If Reason is the atom normal, Pid will not exit. If it is trapping exits, the exit signal is transformed into a message {'EXIT', From, normal} and delivered to its message queue.

If Reason is the atom kill, that is if exit(Pid, kill) is called, an untrappable exit signal is sent to Pid which will unconditionally exit with exit reason killed.
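
The three cases in that passage can be checked against a process that traps exits. This is only a rough sketch under my own assumptions: the module exit_cases and the function probe/1 are invented for the example, and it covers the trapping case only (a non-trapping target simply dies with Reason, or ignores normal).

```erlang
-module(exit_cases).
-export([probe/1]).

%% Send exit(Pid, Signal) to a fresh process that traps exits and report
%% whether the signal was trapped as a message or the process died.
probe(Signal) ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        process_flag(trap_exit, true),
        %% Tell the parent that trap_exit is set before any signal is sent.
        Parent ! {ready, self()},
        receive
            {'EXIT', _From, Reason} -> Parent ! {trapped, self(), Reason}
        end
    end),
    receive {ready, Pid} -> ok end,
    exit(Pid, Signal),
    receive
        {trapped, Pid, Trapped} ->
            %% The target exits normally afterwards; drop that 'DOWN'.
            erlang:demonitor(Ref, [flush]),
            {trapped, Trapped};
        {'DOWN', Ref, process, Pid, Died} ->
            {died, Died}
    after 1000 ->
        timeout
    end.
```

With this, exit_cases:probe(shutdown) and exit_cases:probe(normal) should come back as {trapped, shutdown} and {trapped, normal}, while exit_cases:probe(kill) should come back as {died, killed}: the signal cannot be trapped, and the reason seen from outside is killed.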

The documentation spells it out clearly; we just did not pay attention when we read it.

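Summary: personally I find the rewrite a bit unnecessary. Staying faithful to the reason the user supplied would be better and would avoid this kind of confusion.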

Have fun!
