
How to Write Python Extension Modules in C++ (Part 2)


Creating the Class Exposed by the Python Module (continued)

  1. Creating the class's method table
    • Straight to the code:

      static PyMethodDef VCam_MethodMembers[] =      // List of all member functions of the class; like the member table, it ends with an all-NULL entry
      {
          { "set_fill", (PyCFunction)VCam_SetFill, METH_VARARGS, "Set video resize method (0: Aspect fit, 1: Aspect fill, 2: Stretch), used when input frame size differs from VCam output size." },
          { "mirror", (PyCFunction)VCam_Mirror, METH_VARARGS, "Mirror the output video (0: no mirror, others: mirror), non-persistent." },
          { "rotate", (PyCFunction)VCam_Rotate, METH_VARARGS, "Rotate the input video 90 degrees (0: no rotate, others: rotate), non-persistent." },
          { "flip", (PyCFunction)VCam_Flip, METH_VARARGS, "Vertically flip the output video (0: no flip, others: flip), non-persistent." },
          { "set_default_image", (PyCFunction)VCam_SetDefaultImage, METH_VARARGS, "Set a 24-bit bitmap file as the VCam default idle image, displayed when nothing is being played.\nCalling it with a NULL parameter or an empty string resets it to the default one.\nThe image is resized (aspect fit) only if it's bigger than the VCam output size." },
          { "set_name", (PyCFunction)VCam_SetFriendlyName, METH_VARARGS, "The device's name is \"Virtual Camera\" by default; use this to set a different name." },
          { "set_license", (PyCFunction)VCam_SetLicenseCode, METH_VARARGS, "You can set the license code here if you've purchased the VCam SDK. The watermark (TRIAL) is removed with a valid license; calling it with a wrong one shows the watermark again." },
          { "set_output_format", (PyCFunction)VCam_Format, METH_VARARGS, "Set output format (width, height, fps)." },
          { "send_image", (PyCFunction)VCam_SendImg, METH_VARARGS, "Display an image (path) on VCam." },
          { "capture_screen", (PyCFunction)VCam_CaptureScreen, METH_VARARGS, "Capture a region of the screen and set it as VCam output." },
          { "get_output_format", (PyCFunction)VCam_GetOutputFormat, METH_NOARGS, "Get VCam output video size (640x480 by default), and frame rate (25 by default)." },
          { NULL, NULL, 0, NULL }      // sentinel; ml_flags is an int, so use 0 rather than NULL
      };
    • Definition of the PyMethodDef structure:

      struct PyMethodDef {
          const char  *ml_name;   /* The name of the built-in function/method */
          PyCFunction ml_meth;    /* The C function that implements it */
          int         ml_flags;   /* Combination of METH_xxx flags, which mostly
                                     describe the args expected by the C func */
          const char  *ml_doc;    /* The __doc__ attribute, or NULL */
      };
      typedef struct PyMethodDef PyMethodDef;
      
      #define PyCFunction_New(ML, SELF) PyCFunction_NewEx((ML), (SELF), NULL)
      PyAPI_FUNC(PyObject *) PyCFunction_NewEx(PyMethodDef *, PyObject *,
                                               PyObject *);
      
      /* Flag passed to newmethodobject */
      /* #define METH_OLDARGS  0x0000   -- unsupported now */
      #define METH_VARARGS  0x0001
      #define METH_KEYWORDS 0x0002
      /* METH_NOARGS and METH_O must not be combined with the flags above. */
      #define METH_NOARGS   0x0004
      #define METH_O        0x0008
      
      /* METH_CLASS and METH_STATIC are a little different; these control
         the construction of methods for a class.  These cannot be used for
         functions in modules. */
      #define METH_CLASS    0x0010
      #define METH_STATIC   0x0020
      
      /* METH_COEXIST allows a method to be entered even though a slot has
         already filled the entry.  When defined, the flag allows a separate
         method, "__contains__" for example, to coexist with a defined
         slot like sq_contains. */
      
      #define METH_COEXIST   0x0040
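    • The structure is self-explanatory. The only field that needs care is the third one, ml_flags, which must match the arguments the function actually accepts. The most common flags are listed here; see the manual for the rest.
      • METH_NOARGS: the function takes no arguments.
      • METH_KEYWORDS: the function accepts keyword arguments.
      • METH_VARARGS: the function accepts positional arguments.
      • Some flags can be combined, e.g. METH_VARARGS|METH_KEYWORDS. A function registered this way must take three parameters (self, args, kwargs).
      • Note that METH_NOARGS must not be combined with either of the two flags above.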
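The combination rules above can be checked mechanically. The following is a minimal sketch in Python, not part of the C API: the constants are copied from the header excerpt above, while `VALID_CALL_FLAGS` and `is_valid_call_flags` are hypothetical names made up for illustration. It assumes, as the Python 3 documentation describes, that METH_KEYWORDS is only meaningful together with METH_VARARGS for this set of flags.

```python
# Flag values copied from the header excerpt above.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004
METH_O = 0x0008
METH_CLASS = 0x0010

# Hypothetical helper data: the calling-convention combinations that
# this version of the header allows in ml_flags (an assumption based
# on the header comments and the Python 3 documentation).
VALID_CALL_FLAGS = {
    METH_VARARGS,
    METH_VARARGS | METH_KEYWORDS,
    METH_NOARGS,
    METH_O,
}

# Only these bits describe how arguments are passed; METH_CLASS,
# METH_STATIC and METH_COEXIST are binding modifiers and are ignored here.
CALL_MASK = METH_VARARGS | METH_KEYWORDS | METH_NOARGS | METH_O


def is_valid_call_flags(ml_flags: int) -> bool:
    """Return True if the calling-convention bits form an allowed combination."""
    return (ml_flags & CALL_MASK) in VALID_CALL_FLAGS


for flags in (METH_VARARGS | METH_KEYWORDS, METH_NOARGS | METH_VARARGS):
    print(hex(flags), is_valid_call_flags(flags))
```

A check like this could be run over a method table during code review; CPython itself reports a bad combination only at call time.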
  2. Describing the class with a PyTypeObject instance, VCam_ClassInfo
    • Straight to the code. It fills in most fields of the PyTypeObject structure; see the object.h header for the full definition.

      static PyTypeObject VCam_ClassInfo =
      {
          PyVarObject_HEAD_INIT(NULL, 0)
          "PyVcam.VCam",                  // tp_name: the type name, visible via __class__ (__name__ gives the part after the dot)   const char *
          sizeof(VCam),                   // tp_basicsize: size of the class/struct; PyObject_New needs it   Py_ssize_t
          0,                              // tp_itemsize   Py_ssize_t
          (destructor)VCam_Destruct,      // tp_dealloc: the class destructor   destructor
          0,                              // tp_print: the class print function   printfunc
          0,                              // tp_getattr: the class getattr function   getattrfunc
          0,                              // tp_setattr: the class setattr function   setattrfunc
          0,                              // tp_as_async, formerly known as tp_compare (Python 2) or tp_reserved (Python 3)   PyAsyncMethods *
          0,                              // tp_repr: called by the built-in repr()   reprfunc
          0,                              // tp_as_number pointer   PyNumberMethods *
          0,                              // tp_as_sequence pointer   PySequenceMethods *
          0,                              // tp_as_mapping pointer   PyMappingMethods *
          0,                              // tp_hash   hashfunc
          0,                              // tp_call   ternaryfunc
          0,                              // tp_str: called by the built-ins str() and print()   reprfunc
          0,                              // tp_getattro   getattrofunc
          0,                              // tp_setattro   setattrofunc
          0,                              // tp_as_buffer pointer: functions to access object as input/output buffer   PyBufferProcs *
          Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,   // tp_flags: use Py_TPFLAGS_DEFAULT if nothing special is needed; Py_TPFLAGS_BASETYPE allows subclassing   unsigned long
          "VCam module written in C++!",  // tp_doc: __doc__, the docstring of the class/struct   const char *
          0,                              // tp_traverse: call function for all accessible objects   traverseproc
          0,                              // tp_clear: delete references to contained objects   inquiry
          0,                              // tp_richcompare   richcmpfunc
          0,                              // tp_weaklistoffset   Py_ssize_t
          0,                              // tp_iter   getiterfunc
          0,                              // tp_iternext   iternextfunc


          /* Attribute descriptor and subclassing stuff */

          VCam_MethodMembers,             // tp_methods: all methods of the class   PyMethodDef *
          VCam_DataMembers,               // tp_members: all data members of the class   PyMemberDef *
          0,                              // tp_getset   PyGetSetDef *
          0,                              // tp_base   _typeobject *
          0,                              // tp_dict   PyObject *
          0,                              // tp_descr_get   descrgetfunc
          0,                              // tp_descr_set   descrsetfunc
          0,                              // tp_dictoffset   Py_ssize_t
          (initproc)VCam_init,            // tp_init: the class constructor   initproc
          0,                              // tp_alloc   allocfunc
          0,                              // tp_new   newfunc
          0,                              // tp_free   freefunc
          0,                              // tp_is_gc   inquiry
      };
    • VCam_ClassInfo ties together the init function, destructor, method table, member table and so on that were created earlier.
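Several of the slots filled in above are visible from Python once a type exists. Since the PyVcam module is not available here, the sketch below inspects built-in types instead. It assumes CPython, and it assumes the bit value of Py_TPFLAGS_BASETYPE is `1 << 10` as in CPython's object.h; `describe` is a hypothetical helper name.

```python
# Bit value of Py_TPFLAGS_BASETYPE in CPython's object.h (assumption: 1 << 10).
PY_TPFLAGS_BASETYPE = 1 << 10


def describe(tp: type) -> dict:
    """Collect the Python-visible counterparts of a few PyTypeObject slots."""
    return {
        "tp_name": tp.__name__,            # last component of tp_name
        "tp_basicsize": tp.__basicsize__,  # instance size in bytes
        "tp_itemsize": tp.__itemsize__,    # per-item size for variable-size objects
        "tp_doc": tp.__doc__,              # docstring of the type
        "subclassable": bool(tp.__flags__ & PY_TPFLAGS_BASETYPE),
    }


for tp in (int, bool):
    info = describe(tp)
    print(info["tp_name"], info["tp_basicsize"], info["subclassable"])
```

A type such as `bool`, which does not set Py_TPFLAGS_BASETYPE, cannot be used as a base class. VCam_ClassInfo sets the flag, so Python code should be able to subclass VCam.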

Creating and Initializing the Module

  1. Creating the module definition
    • Straight to the code:

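      static PyModuleDef ModuleInfo =
      {
          PyModuleDef_HEAD_INIT,
          "PyVcam",               // m_name: the module's built-in name, __name__
          NULL,                   // m_doc: the module's docstring, __doc__
          -1,                     // m_size: -1 means the module keeps its state in global variables
          NULL, NULL, NULL, NULL, NULL   // m_methods, m_slots, m_traverse, m_clear, m_free
      };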
    • Definition of the PyModuleDef structure:

      typedef struct PyModuleDef{
        PyModuleDef_Base m_base;
        const char* m_name;
        const char* m_doc;
        Py_ssize_t m_size;
        PyMethodDef *m_methods;
        struct PyModuleDef_Slot* m_slots;
        traverseproc m_traverse;
        inquiry m_clear;
        freefunc m_free;
      } PyModuleDef;
  2. Initializing the module
    • Code first:

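      PyMODINIT_FUNC PyInit_PyVcam(void)       // The module's external name is PyVcam
      {
          // Start GDI+; m_gdiplusToken is presumably a global defined elsewhere in the module
          Gdiplus::GdiplusStartupInput StartupInput;
          GdiplusStartup(&m_gdiplusToken, &StartupInput, NULL);

          PyObject* pReturn = 0;
          VCam_ClassInfo.tp_new = PyType_GenericNew;       // The type's tp_new slot, which creates new objects

          if (PyType_Ready(&VCam_ClassInfo) < 0)
              return NULL;

          pReturn = PyModule_Create(&ModuleInfo);
          if (pReturn == NULL)
              return NULL;
          Py_INCREF(&VCam_ClassInfo);
          // Add the class to the module's dictionary; the reference is stolen only on success
          if (PyModule_AddObject(pReturn, "VCam", (PyObject*)&VCam_ClassInfo) < 0)
          {
              Py_DECREF(&VCam_ClassInfo);
              Py_DECREF(pReturn);
              return NULL;
          }

          return pReturn;
      }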
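    • Explanation of the code:
      • A Python module must export a function named PyInit_XXX that returns PyObject* and initializes the module. Python calls this function directly when it loads the module.
      • On Windows the PyMODINIT_FUNC macro essentially expands to __declspec(dllexport) PyObject* (with extern "C" added when compiling as C++).
      • The statement VCam_ClassInfo.tp_new = PyType_GenericNew; is optional: you could instead put PyType_GenericNew in the tp_new position of the VCam_ClassInfo initializer. Finding the right position among that many fields is error-prone, so it is simply assigned here. Conversely, the trailing fields of the VCam_ClassInfo initializer can be left out and set in the form VCam_ClassInfo.xxx = xxx.
      • Call PyType_Ready(&VCam_ClassInfo) to finish defining the class.
      • Then create the module with PyModule_Create(&ModuleInfo).
      • Call PyModule_AddObject to add the VCam class to the module, and don't forget to increment the type object's reference count first.
      • Return the module to Python, and you're done.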
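The naming rule for the init function can be seen from the Python side. The following sketch assumes CPython and ASCII module names; `init_symbol` and `candidate_filenames` are hypothetical helpers written for illustration, while `importlib.machinery.EXTENSION_SUFFIXES` is the standard-library list of file suffixes the import system accepts for extension modules.

```python
import importlib.machinery


def init_symbol(module_name: str) -> str:
    """Name of the exported C function looked up for an ASCII module name."""
    # For a module inside a package only the last component is used.
    last = module_name.rpartition(".")[2]
    return "PyInit_" + last


def candidate_filenames(module_name: str) -> list:
    """File names the import system would accept for this extension module."""
    last = module_name.rpartition(".")[2]
    return [last + suffix for suffix in importlib.machinery.EXTENSION_SUFFIXES]


print(init_symbol("PyVcam"))
print(candidate_filenames("PyVcam"))
```

If the built file name and the PyInit_ suffix do not agree, `import PyVcam` fails because the expected init function cannot be found in the library.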

Performance of Python C Extensions (the GIL)

1. The GIL problem
* How the GIL works

    for (;;) {
        if (--ticker < 0) {   // Older CPython: release the GIL once every check_interval bytecode instructions. Newer versions (Python 3.2+) switch on a time interval instead.
            ticker = check_interval;
            /* Give another thread a chance */
            PyThread_release_lock(interpreter_lock);   // release the GIL
            /* Other threads may run now */
            PyThread_acquire_lock(interpreter_lock, 1); // immediately request the GIL again; between the release and the grab, other threads get their chance
        }
        bytecode = *next_instr++;  // fetch the next Python instruction
        switch (bytecode) {
            /* execute the next instruction ... */  // execute the instruction
        }
    }

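* Because of the CPython GIL, a thread that is executing Python bytecode holds the GIL while the other threads wait to grab it. Multithreading therefore degrades to effectively single-threaded execution: no matter how many threads you start, only one of them runs Python code at any moment.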
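On current CPython the time-based interval mentioned in the loop above can be read from the `sys` module. This is a small sketch, assuming Python 3.2 or later; `gil_switch_interval_ms` is a hypothetical helper name, and the documented default interval is 5 ms.

```python
import sys


def gil_switch_interval_ms() -> float:
    """Return the interpreter's thread switch interval in milliseconds."""
    # sys.getswitchinterval() returns the interval in seconds.
    return sys.getswitchinterval() * 1000.0


print(f"thread switch interval: {gil_switch_interval_ms():.1f} ms")
```

`sys.setswitchinterval()` changes the value, but it only affects how often Python threads take turns; it does not let two threads execute bytecode at the same time.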
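  2. Working around the GIL
    In pure Python, CPython's GIL looks unavoidable. But is there really no way around it?
    • It is well known that multithreading speeds up I/O-bound workloads considerably, which means the GIL is released while an I/O operation is in progress. That release clearly does not happen at ticker < 0, so how does an I/O operation release the GIL?
      • I/O operations release it as follows: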

        /* s.connect((host, port)) method */
        static PyObject *
        sock_connect(PySocketSockObject *s, PyObject *addro)
        {
            sock_addr_t addrbuf;
            int addrlen;
            int res;
        
            /* convert (host, port) tuple to C address */
            getsockaddrarg(s, addro, SAS2SA(&addrbuf), &addrlen);
        
            Py_BEGIN_ALLOW_THREADS
            res = connect(s->sock_fd, SAS2SA(&addrbuf), addrlen);
            Py_END_ALLOW_THREADS
        
            /* error handling and so on .... */
        }
      • The above is an excerpt of the socket code. Before calling connect it invokes the macro Py_BEGIN_ALLOW_THREADS, which releases the GIL; once connect returns, Py_END_ALLOW_THREADS reacquires it.
      • That solves the GIL problem.
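The effect of releasing the GIL around a blocking call can be observed from pure Python. The sketch below uses `time.sleep` as a stand-in for blocking I/O, since it also releases the GIL while waiting; `run_sleepers` is a hypothetical helper name, and the thread count and duration are arbitrary.

```python
import threading
import time


def run_sleepers(n_threads: int, seconds: float) -> float:
    """Start n_threads that each block in time.sleep; return the wall time."""
    threads = [
        threading.Thread(target=time.sleep, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_sleepers(4, 0.2)
print(f"4 threads x 0.2 s of blocking took {elapsed:.2f} s in total")
```

If the blocked threads kept the GIL, the total would be close to the sum of the individual waits; because they release it, the total stays close to a single wait.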

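    • Who says CPU-bound work cannot use multiple threads? If the computation lives in a C++ module that releases the GIL, the threads can run in parallel just the same.
      • Let's write some code to verify this.
      • The C++ module contains two CPU-bound functions. Their computation and return value are identical; the only difference is that one of them releases the GIL before the heavy computation and reacquires it when the computation is done.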

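        // Releases the GIL while computing; presumably exposed to Python as compute_without_gil (method table not shown)
        static PyObject* Gil_free(GilTest* self, PyObject* args){
            LONGLONG num;
            if (!PyArg_ParseTuple(args, "L", &num)) return NULL;

            LONGLONG rst = 0;
            Py_BEGIN_ALLOW_THREADS        // release the GIL before the heavy computation
                for (LONGLONG i = 1; i <= num * 100; i++)
                {
                    for (LONGLONG j = 1; j <= num * 100; j++)
                    {
                        rst = i*j;
                    }
                }
            Py_END_ALLOW_THREADS          // reacquire the GIL before touching Python objects
            return Py_BuildValue("L", rst);   // "L" matches LONGLONG (the original "i" expects an int)
        }
        // Keeps the GIL while computing; presumably exposed to Python as compute_with_gil
        static PyObject* Gil_lock(GilTest* self, PyObject* args){
            LONGLONG num;
            if (!PyArg_ParseTuple(args, "L", &num)) return NULL;
            LONGLONG rst = 0;
            for (LONGLONG i = 1; i <= num * 100; i++)
            {
                for (LONGLONG j = 1; j <= num * 100; j++)
                {
                    rst = i*j;
                }
            }
            return Py_BuildValue("L", rst);
        }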
      • Wrap the functions in a Python module, then write a script that calls them from multiple threads:

        from GilTest import GilTest
        import time
        from threading import Thread
        
        
        def foo(num, i, start):
            obj = GilTest()
            obj.compute_with_gil(num)   # this function does not release the GIL while computing
            print("foo %s is over" % i, time.time() - start)
        
        
        def bar(num, i, start):
            obj = GilTest()
            obj.compute_without_gil(num)   # this function releases the GIL while computing
            print("bar %s is over" % i, time.time() - start)
        
        
        def run():
            print("stat foo")
            start = time.time()  # start timing when the foo threads are launched
            thread_list1 = []
            for i in range(10):
                thread_list1.append(Thread(target=foo, args=(1000, i, start)))
            for i in thread_list1:
                i.start()
            for i in thread_list1:
                i.join()
            print("stat bar")
            time.sleep(1) 
            start = time.time()   # start timing when the bar threads are launched
            thread_list2 = []
            for i in range(10):
                thread_list2.append(Thread(target=bar, args=(1000, i, start)))
            for i in thread_list2:
                i.start()
            for i in thread_list2:
                i.join()
        
        
        if __name__ == '__main__':
        
            run()
      • Output of the run:

        stat foo
        foo 0 is over 2.2932560443878174
        foo 1 is over 4.577575445175171
        foo 2 is over 6.859208583831787
        foo 3 is over 9.145148277282715
        foo 4 is over 11.43115520477295
        foo 5 is over 13.71883225440979
        foo 6 is over 15.999829292297363
        foo 7 is over 18.281397581100464
        foo 8 is over 20.57776975631714
        foo 9 is over 22.851707935333252
        stat bar
        bar 3 is over 4.594241380691528
        bar 6 is over 4.594241380691528
        bar 7 is over 4.609868288040161
        bar 2 is over 4.63910174369812
        bar 8 is over 5.750362157821655
        bar 4 is over 5.765988826751709
        bar 5 is over 5.859748840332031
        bar 0 is over 5.859748840332031
        bar 1 is over 5.859748840332031
        bar 9 is over 5.937881946563721
        
        Process finished with exit code 0
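      • The foo threads behave exactly like a single thread: each one finishes about 2.3 seconds after the previous one. The bar threads are truly parallel: their completion times differ only slightly and they finish out of order. Because the CPU has only four cores, the ten threads still compete for CPU time, so each bar thread takes longer than a foo thread (each foo computation ran almost exclusively on its own).
      • Now look at the CPU usage: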
        [Screenshots: CPU usage of the python process while the foo and bar threads run]

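      While foo runs, the python process uses about 15% of the CPU. When execution reaches the bar threads, the python process's CPU usage shoots straight up to nearly 100%, which confirms that the Python threads are now running in parallel.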
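The timings in the output above can be summarized with a little arithmetic. The sketch below copies the completion times from the log; the variable names are made up for illustration, and the "speedup" is simply the ratio of the two total wall times.

```python
# Completion times (seconds) copied from the output above.
foo_done = [
    2.2932560443878174, 4.577575445175171, 6.859208583831787,
    9.145148277282715, 11.43115520477295, 13.71883225440979,
    15.999829292297363, 18.281397581100464, 20.57776975631714,
    22.851707935333252,
]
bar_done = [
    4.594241380691528, 4.594241380691528, 4.609868288040161,
    4.63910174369812, 5.750362157821655, 5.765988826751709,
    5.859748840332031, 5.859748840332031, 5.859748840332031,
    5.937881946563721,
]

# Average delay between consecutive foo completions (serialized by the GIL).
gaps = [later - earlier for earlier, later in zip(foo_done, foo_done[1:])]
mean_gap = sum(gaps) / len(gaps)

# Total wall time with the GIL held versus with the GIL released.
speedup = max(foo_done) / max(bar_done)

print(f"mean gap between foo threads: {mean_gap:.2f} s")
print(f"foo total / bar total: {speedup:.2f}")
```

On the four-core machine described above, a ratio a little under 4 is roughly what one would expect if the bar threads keep all cores busy.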
