系統技術非業餘研究 » The Erlang match_spec Engine: Introduction and Applications

What is a match_spec?

A “match specification” (match_spec) is an Erlang term describing a small “program” that will try to match something (either the parameters to a function as used in the erlang:trace_pattern/2 BIF, or the objects in an ETS table.). The match_spec in many ways works like a small function in Erlang, but is interpreted/compiled by the Erlang runtime system to something much more efficient than calling an Erlang function. The match_spec is also very limited compared to the expressiveness of real Erlang functions.

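For details, see the "Match Specifications in Erlang" chapter of the ERTS User's Guide.
Put plainly, a match_spec is a filter for Erlang terms: the user decides what to match and which data to extract from the term. You might ask: Erlang functions are already powerful and can do everything a match_spec can, so why go to the trouble of building something new?
Erlang implements match_spec for two reasons: 1. execution efficiency; 2. it is small enough to be used at runtime.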
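
To make the "filter plus extraction" idea concrete, here is a minimal sketch that runs a hand-written match_spec through ets:select/2. The module name, table name, and sample data are made up for illustration; only the spec itself mirrors the one used later in this post.

```erlang
-module(ms_filter_demo).
-export([run/0]).

%% Hypothetical demo: the table and its contents are illustrative only.
run() ->
    T = ets:new(demo, [set]),
    true = ets:insert(T, [{1, tag}, {5, tag}, {7, other}]),
    %% Head:  match 2-tuples whose second element is the atom tag,
    %%        binding the first element to '$1'.
    %% Guard: keep only objects where '$1' > 1.
    %% Body:  return '$1' rather than the whole object.
    Spec = [{{'$1', tag}, [{'>', '$1', 1}], ['$1']}],
    Result = ets:select(T, Spec),
    true = ets:delete(T),
    Result.
```

Calling ms_filter_demo:run() should return only the values that pass both the pattern and the guard.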

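The idea behind the implementation: match_spec is an engine with its own syntax. A spec is first compiled into dedicated opcodes, and those opcodes are then executed at match time to produce the result. You can think of it as a DSL inside Erlang.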

Let's first get an intuitive feel for this DSL:

// erl_db_util.c:L5038
#ifdef DMC_DEBUG

/*
** Disassemble match program
*/
void db_match_dis(Binary *bp)
{
...
}
#endif /* DMC_DEBUG */

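The code above shows that the disassembler is only compiled in when DMC_DEBUG is defined, so a beam.smp with match_spec disassembly support can be built as follows: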

$ cd otp
$ export ERL_TOP=`pwd`
$ cd erts
$ make debug FLAVOR=smp DMC_DEBUG=1
$ sudo make install

With a runtime that supports disassembly, let's look at the opcodes of a match_spec:

$ erl
Erlang R14B04 (erts-5.8.5) [/source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.5  (abort with ^G)
1> Spec=ets:fun2ms(fun({A, tag}) when A>1 ->A end).
[{{'$1',tag},[{'>','$1',1}],['$1']}]
%% Turn on the disassembly flag
2> erlang:match_spec_test({1234,tag}, Spec, dis).  
true
3> erlang:match_spec_test({1234,tag}, Spec, table).
Tuple	2
Bind	1
Eq  	tag
PushC	1
PushV	1
Call2	'>'
True
Catch
PushVResult	1
Return
Halt


term_save: {}
num_bindings: 2
heap_size: 24
stack_offset: 12
text: 0x006e3ecc
stack_size: 12 (words)
{ok,1234,[],[]}
4> 
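
On a stock runtime built without DMC_DEBUG there is no opcode dump, but the same spec can still be checked with ets:test_ms/2, which validates the spec and runs it against a single tuple. A minimal sketch follows; the module name and the non-matching sample tuples are illustrative assumptions.

```erlang
-module(ms_test_demo).
-export([run/0]).

%% Hypothetical demo module: checks the spec from the session above
%% against a few sample tuples.
run() ->
    Spec = [{{'$1', tag}, [{'>', '$1', 1}], ['$1']}],
    %% Pattern and guard both succeed: the body value '$1' is returned.
    {ok, 1234} = ets:test_ms({1234, tag}, Spec),
    %% Guard fails (0 > 1 is false): the result is false.
    {ok, false} = ets:test_ms({0, tag}, Spec),
    %% Pattern fails (second element is not tag): the result is false.
    {ok, false} = ets:test_ms({1234, other}, Spec),
    ok.
```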

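As you can see, it has its own opcodes and its own VM, and the syntax is not very intuitive. To make it easier to use, Erlang provides a way to translate a fun into a match_spec. The module that does this is ms_transform; see its reference page in stdlib.
We usually do the translation through dbg:fun2ms or ets:fun2ms. Let's see how that is implemented.
ms_transform.hrl contains just one line: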

-compile({parse_transform,ms_transform}).
%%ms_transform.erl
copy({call,Line,{remote,_Line2,{atom,_Line3,ets},{atom,_Line4,fun2ms}},
      As0},Bound) ->
    {transform_call(ets,Line,As0,Bound),Bound};
copy({call,Line,{remote,_Line2,{record_field,_Line3,
                                {atom,_Line4,''},{atom,_Line5,ets}},
                 {atom,_Line6,fun2ms}}, As0},Bound) ->
    %% Packages...
    {transform_call(ets,Line,As0,Bound),Bound};
copy({call,Line,{remote,_Line2,{atom,_Line3,dbg},{atom,_Line4,fun2ms}},
      As0},Bound) ->
    {transform_call(dbg,Line,As0,Bound),Bound};

The code shows that ms_transform performs the compile-time match_spec translation only for calls to dbg:fun2ms and ets:fun2ms.
Let's verify this understanding:

$ cat veri.erl
-module(veri).
-export([start/0]).
-include_lib("stdlib/include/ms_transform.hrl").

start()->
    ets:fun2ms(fun({M,N}) when N > 3 -> M end).

$ erlc +"'S'" veri.erl
$ cat veri.S
{module, veri}.  %% version = 0

{exports, [{module_info,0},{module_info,1},{start,0}]}.

{attributes, []}.

{labels, 7}.


{function, start, 0, 2}.
  {label,1}.
    {func_info,{atom,veri},{atom,start},0}.
  {label,2}.
    {move,{literal,[{{'$1','$2'},[{'>','$2',3}],['$1']}]},{x,0}}.
    return.


{function, module_info, 0, 4}.
  {label,3}.
    {func_info,{atom,veri},{atom,module_info},0}.
  {label,4}.
    {move,{atom,veri},{x,0}}.
    {call_ext_only,1,{extfunc,erlang,get_module_info,1}}.


{function, module_info, 1, 6}.
  {label,5}.
    {func_info,{atom,veri},{atom,module_info},1}.
  {label,6}.
    {move,{x,0},{x,1}}.
    {move,{atom,veri},{x,0}}.
    {call_ext_only,2,{extfunc,erlang,get_module_info,2}}.

$ erl 
Erlang R14B04 (erts-5.8.5)  [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.8.5  (abort with ^G)
1> veri:start().
[{{'$1','$2'},[{'>','$2',3}],['$1']}]

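From the assembly we can see that the result of ets:fun2ms(fun({M,N}) when N > 3 -> M end). is already computed at compile time and is simply loaded as a literal. In other words, the translation costs no execution time at runtime, so there is no need to worry about its performance.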
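
The same compile-time translation applies to dbg:fun2ms, with one difference: the fun takes a single list argument (the traced call's parameter list) instead of a tuple. The sketch below is an illustration only; the module name is hypothetical, and return_trace() is one of the trace actions accepted in a dbg match_spec body.

```erlang
-module(ms_dbg_demo).
-export([spec/0]).
-include_lib("stdlib/include/ms_transform.hrl").

%% Hypothetical demo: the parse transform replaces this call with a
%% literal match_spec at compile time, as in veri.erl.
spec() ->
    dbg:fun2ms(fun([A, tag]) when A > 1 -> return_trace() end).
```

The head of the resulting spec is a list pattern rather than a tuple pattern, because trace match_specs match against the argument list of the traced function.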

So how do we use it? ETS gives us a good example:

match_spec_compile(MatchSpec) -> CompiledMatchSpec

Types:
MatchSpec = match_spec()
CompiledMatchSpec = comp_match_spec()

This function transforms a match_spec into an internal representation that can be used in subsequent calls to ets:match_spec_run/2. The internal representation is opaque and can not be converted to external term format and then back again without losing its properties (meaning it can not be sent to a process on another node and still remain a valid compiled match_spec, nor can it be stored on disk). The validity of a compiled match_spec can be checked using ets:is_compiled_ms/1.

If the term MatchSpec can not be compiled (does not represent a valid match_spec), a badarg fault is thrown.
Note

This function has limited use in normal code, it is used by Dets to perform the dets:select operations.

match_spec_run(List,CompiledMatchSpec) -> list()

Types:
List = [ tuple() ]
CompiledMatchSpec = comp_match_spec()

This function executes the matching specified in a compiled match_spec on a list of tuples. The CompiledMatchSpec term should be the result of a call to ets:match_spec_compile/1 and is hence the internal representation of the match_spec one wants to use.

The matching will be executed on each element in List and the function returns a list containing all results. If an element in List does not match, nothing is returned for that element. The length of the result list is therefore equal to or less than the length of the parameter List. The two calls in the following example will give the same result (but certainly not the same execution time…):

Table = ets:new…
MatchSpec = ….
% The following call…
ets:match_spec_run(ets:tab2list(Table),
ets:match_spec_compile(MatchSpec)),
% …will give the same result as the more common (and more efficient)
ets:select(Table,MatchSpec),

Note

This function has limited use in normal code, it is used by Dets to perform the dets:select operations and by Mnesia during transactions.
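
The documentation example above elides the table and the spec. A runnable sketch of the same comparison, under assumed sample data, might look like this; the module name, table name, and rows are hypothetical, and the spec is the one produced by veri:start() earlier.

```erlang
-module(ms_run_demo).
-export([run/0]).

%% Hypothetical demo: compares match_spec_run/2 on a list with
%% ets:select/2 on the table the list came from.
run() ->
    T = ets:new(demo, [set]),
    true = ets:insert(T, [{a, 1}, {b, 5}, {c, 9}]),
    MatchSpec = [{{'$1', '$2'}, [{'>', '$2', 3}], ['$1']}],
    %% Compile once; the compiled form is opaque and node-local.
    Compiled = ets:match_spec_compile(MatchSpec),
    true = ets:is_compiled_ms(Compiled),
    ViaRun = ets:match_spec_run(ets:tab2list(T), Compiled),
    ViaSelect = ets:select(T, MatchSpec),
    true = ets:delete(T),
    %% A set table has no defined traversal order, so sort before comparing.
    {lists:sort(ViaRun), lists:sort(ViaSelect)}.
```

Because match_spec_run/2 works on any list of tuples, the compiled spec can be reused across many lists without going through a table at all.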

$ grep -rin db_match_dbterm .
./erl_db_hash.c:1307:       (match_res = db_match_dbterm(&tb->common, p, mp, all_objects,
./erl_db_hash.c:1472:           match_res = db_match_dbterm(&tb->common, p, mpi.mp, 0,
./erl_db_hash.c:1638:           if (db_match_dbterm(&tb->common, p, mpi.mp, 0,
./erl_db_hash.c:1787:       if (db_match_dbterm(&tb->common, p, mpi.mp, 0,
./erl_db_hash.c:1898:       if (db_match_dbterm(&tb->common, p, mp, 0,
./erl_db_hash.c:1998:       if (db_match_dbterm(&tb->common, p, mp, 0, &current->dbterm,
./erl_db_tree.c:2971:    ret = db_match_dbterm(&tb->common,sc->p,sc->mp,sc->all_objects,
./erl_db_tree.c:3004:    ret = db_match_dbterm(&tb->common, sc->p, sc->mp, 0,
./erl_db_tree.c:3036:    ret = db_match_dbterm(&tb->common, sc->p, sc->mp, sc->all_objects,
./erl_db_tree.c:3073:    ret = db_match_dbterm(&tb->common, sc->p, sc->mp, 0,

Summary: match_spec can be used inside driver modules we implement ourselves to do filtering and matching.

Have fun!
