Flask source code walkthrough --- from request to response

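I have been learning Flask for a short while and have always wanted to understand its source code. I recently read an excellent blog post that analyzes it thoroughly, and to my surprise I could follow it, so here is my own walkthrough.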

1. Before Flask, a word on WSGI:

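Once you understand the HTTP protocol and HTML documents, the essence of a web application becomes clear:

  1. The browser sends an HTTP request;

  2. The server receives the request and generates an HTML document;

  3. The server sends the HTML document to the browser as the body of the HTTP response;

  4. The browser receives the HTTP response, extracts the HTML document from the body, and displays it.

So the simplest web application stores HTML in files ahead of time and uses an off-the-shelf HTTP server to accept user requests, read the HTML from the files, and return it. Common static servers such as Apache, Nginx, and Lighttpd do exactly this.

To generate HTML dynamically, we would have to implement those steps ourselves. But accepting HTTP requests, parsing them, and sending HTTP responses is grunt work; if we wrote that low-level code ourselves, we would spend a month or so reading the HTTP specification before writing any dynamic HTML.

The right approach is to leave the low-level code to dedicated server software and use Python to focus on generating HTML documents. Because we do not want to deal with TCP connections or the raw HTTP request and response formats, we need a uniform interface that lets us concentrate on writing web business logic in Python.

That interface is WSGI: Web Server Gateway Interface.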

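2. What WSGI does:

WSGI acts as an interface: on one side it connects to the server, on the other to the application's own logic.

The WSGI interface definition is very simple: it only requires the web developer to implement one function in order to respond to HTTP requests. Here is the simplest web version of "Hello, web!":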

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # WSGI requires the body to be an iterable of bytes, not a str
    return [b'<h1>Hello, web!</h1>']

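The application() function above is an HTTP handler that conforms to the WSGI standard. It takes two parameters:

  • environ: a dict object containing all the HTTP request information;

  • start_response: a function that starts the HTTP response by sending the status and headers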
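To see both parameters at work without a real server, the function can be called by hand. This is a minimal sketch, assuming an environ filled in by the standard library's wsgiref.util.setup_testing_defaults and a hypothetical fake_start_response that only records what it is given:

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # WSGI bodies are iterables of bytes
    return [b'<h1>Hello, web!</h1>']


captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Hypothetical stand-in for the callable a WSGI server would pass in
    captured['status'] = status
    captured['headers'] = headers


environ = {}
setup_testing_defaults(environ)  # fills in REQUEST_METHOD, PATH_INFO, and so on
body = b''.join(application(environ, fake_start_response))
print(captured['status'], body)
```

A real server does the same two things: it builds environ from the incoming request and supplies its own start_response.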

3. Flask and WSGI

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello World!'

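That is all it takes to create a Flask instance.

When the WSGI server calls app, what actually runs is Flask's __call__ method; this is where the app's work begins.

The source of Flask's __call__ is as follows: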

class Flask(_PackageBoundObject):

    # ... (omitted) ...

    def __call__(self, environ, start_response):  # __call__ method of the Flask instance
        """Shortcut for :attr:`wsgi_app`."""
        return self.wsgi_app(environ, start_response)

So when a request arrives, __call__ is invoked, but as you can see it simply calls the wsgi_app method, passing along the environ and start_response arguments.

Here is how wsgi_app is defined:

    def wsgi_app(self, environ, start_response):
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied without losing a
        reference to the class.  So instead of doing this::

            app = MyMiddleware(app)

        It's a better idea to do this instead::

            app.wsgi_app = MyMiddleware(app.wsgi_app)

        Then you still have the original application object around and
        can continue to call methods on it.

        .. versionchanged:: 0.7
           The behavior of the before and after request callbacks was changed
           under error conditions and a new callback was added that will
           always execute at the end of the request, independent on if an
           error occurred or not.  See :ref:`callbacks-and-errors`.

        :param environ: a WSGI environment
        :param start_response: a callable accepting a status code,
                               a list of headers and an optional
                               exception context to start the response
        """
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            return response(environ, start_response)
        finally:
            if self.should_ignore_error(error):
                error = None
            ctx.auto_pop(error)
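The docstring's advice about middleware can be tried without Flask. The following is only a sketch: TinyApp and UpperCaseMiddleware are hypothetical names, and TinyApp copies nothing from Flask except the way __call__ forwards to wsgi_app.

```python
class TinyApp:
    """Hypothetical stand-in for Flask, only to show the delegation."""

    def wsgi_app(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'handled by wsgi_app']

    def __call__(self, environ, start_response):
        # Calling the instance forwards to whatever wsgi_app currently is
        return self.wsgi_app(environ, start_response)


class UpperCaseMiddleware:
    """Hypothetical middleware wrapping any WSGI callable."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, environ, start_response):
        return [chunk.upper() for chunk in self.wrapped(environ, start_response)]


app = TinyApp()
# Wrap only the inner callable; app itself is still a TinyApp
app.wsgi_app = UpperCaseMiddleware(app.wsgi_app)

result = app({}, lambda status, headers: None)
print(result, isinstance(app, TinyApp))
```

Because only app.wsgi_app is replaced, the original object and its methods remain available, which is the point the docstring makes.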

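Step 1: create the request object and the request context:

The line ctx = self.request_context(environ) brings in the concepts of the request context and the application context. Both are kept in stack structures and behave like stacks (last in, first out).

Put simply, it creates a request object together with a request_context that holds the request information.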
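The stack behaviour behind ctx.push() and the later pop can be pictured with a toy model. This is a simplified sketch with hypothetical classes (ContextStack, RequestContext); Flask's own stack also handles per-thread isolation, which is left out here.

```python
class ContextStack:
    """Hypothetical, simplified stand-in for a context stack."""

    def __init__(self):
        self._items = []

    def push(self, ctx):
        self._items.append(ctx)

    def pop(self):
        return self._items.pop()

    @property
    def top(self):
        # The most recently pushed context, or None when the stack is empty
        return self._items[-1] if self._items else None


class RequestContext:
    """Hypothetical context that simply carries the environ as its request."""

    def __init__(self, stack, environ):
        self.stack = stack
        self.request = environ

    def push(self):
        self.stack.push(self)

    def pop(self):
        popped = self.stack.pop()
        assert popped is self, 'popped the wrong context'


stack = ContextStack()
outer = RequestContext(stack, {'PATH_INFO': '/outer'})
inner = RequestContext(stack, {'PATH_INFO': '/inner'})
outer.push()
inner.push()
current_path = stack.top.request['PATH_INFO']    # last in, first out
inner.pop()
after_pop_path = stack.top.request['PATH_INFO']
outer.pop()
print(current_path, after_pop_path, stack.top)
```

Code that reads stack.top always sees the context of the request currently being handled, which is how a global-looking request object can refer to the right request.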

Step 2: request preprocessing, error handling, and the path from request to response:

response = self.full_dispatch_request()

response is assigned the return value of full_dispatch_request(), so let's look at that method:

    def full_dispatch_request(self):
        """Dispatches the request and on top of that performs request
        pre and postprocessing as well as HTTP exception catching and
        error handling.

        .. versionadded:: 0.7
        """
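        self.try_trigger_before_first_request_functions()  # one-time setup before the first request
        try:
            request_started.send(self)  # send the request_started signal to its subscribers
            rv = self.preprocess_request()  # run request preprocessing (before_request hooks)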
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)
        return self.finalize_request(rv)

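First, try_trigger_before_first_request_functions(): on the first request it runs the registered before_first_request functions and then sets _got_first_request to True. Once the flag is True, it returns immediately and request handling proceeds.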

    def try_trigger_before_first_request_functions(self):
        """Called before each request and will ensure that it triggers
        the :attr:`before_first_request_funcs` and only exactly once per
        application instance (which means process usually).

        :internal:
        """
        if self._got_first_request:
            return
        with self._before_request_lock:
            if self._got_first_request:
                return
            for func in self.before_first_request_funcs:
                func()
            self._got_first_request = True

got_first_request is exposed as a read-only property. As its docstring says, this attribute is set to True if the application started handling the first request.

class Flask(_PackageBoundObject):
    # ... (omitted) ...
 
    @property
    def got_first_request(self):
        """This attribute is set to ``True`` if the application started
        handling the first request.

        .. versionadded:: 0.8
        """
        return self._got_first_request

Back in full_dispatch_request(), look at preprocess_request(). This is where Flask's hooks run; they work much like middleware and implement the before_request feature.

        try:
            request_started.send(self)
            rv = self.preprocess_request()
            if rv is None:
                rv = self.dispatch_request()
        except Exception as e:
            rv = self.handle_user_exception(e)

dispatch_request() dispatches the request. Once a request arrives for a URL, how does the app know how to respond? Through this method.

Step 3: request dispatch with dispatch_request:

    def dispatch_request(self):
        """Does the request dispatching.  Matches the URL and returns the
        return value of the view or error handler.  This does not have to
        be a response object.  In order to convert the return value to a
        proper response object, call :func:`make_response`.

        .. versionchanged:: 0.7
           This no longer does the exception handling, this code was
           moved to the new :meth:`full_dispatch_request`.
        """
        req = _request_ctx_stack.top.request  # take the request from the top of the request context stack
        if req.routing_exception is not None:
            self.raise_routing_exception(req)
        rule = req.url_rule
        # if we provide automatic options for this URL and the
        # request came with the OPTIONS method, reply automatically
        if getattr(rule, 'provide_automatic_options', False) \
           and req.method == 'OPTIONS':
            return self.make_default_options_response()
        # otherwise dispatch to the handler for that endpoint
        return self.view_functions[rule.endpoint](**req.view_args)

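The main job of this step is to connect the '/' in @app.route('/') with the index function: the matched rule's endpoint is used as a key into view_functions, and the view found there is called with the URL arguments. How the URL itself gets matched to a rule is more involved, and I have not fully worked it out.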
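The two lookups involved (URL to endpoint, then endpoint to view function) can be imitated with two dictionaries. This is a deliberately simplified sketch: MiniRouter is a hypothetical class that only does exact-match paths, whereas Flask relies on Werkzeug's routing for real URL matching.

```python
class MiniRouter:
    """Hypothetical exact-match router, for illustration only."""

    def __init__(self):
        self.url_map = {}          # URL path -> endpoint name
        self.view_functions = {}   # endpoint name -> view function

    def route(self, path):
        def decorator(func):
            endpoint = func.__name__   # the function name serves as the endpoint
            self.url_map[path] = endpoint
            self.view_functions[endpoint] = func
            return func
        return decorator

    def dispatch(self, path, **view_args):
        endpoint = self.url_map[path]
        return self.view_functions[endpoint](**view_args)


router = MiniRouter()


@router.route('/')
def index():
    return 'Hello World!'


result = router.dispatch('/')
print(router.url_map, result)
```

The final line of dispatch has the same shape as the last line of dispatch_request: look up the view by endpoint, then call it with the view arguments.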

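Next, full_dispatch_request() hands rv to finalize_request(), which uses make_response() to turn it into a response; that result is what wsgi_app assigns to response.

So how does make_response() do it? Here is the source: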

    def make_response(self, rv):
        """Converts the return value from a view function to a real
        response object that is an instance of :attr:`response_class`.

        The following types are allowed for `rv`:

        .. tabularcolumns:: |p{3.5cm}|p{9.5cm}|

        ======================= ===========================================
        :attr:`response_class`  the object is returned unchanged
        :class:`str`            a response object is created with the
                                string as body
        :class:`unicode`        a response object is created with the
                                string encoded to utf-8 as body
        a WSGI function         the function is called as WSGI application
                                and buffered as response object
        :class:`tuple`          A tuple in the form ``(response, status,
                                headers)`` or ``(response, headers)``
                                where `response` is any of the
                                types defined here, `status` is a string
                                or an integer and `headers` is a list or
                                a dictionary with header values.
        ======================= ===========================================

        :param rv: the return value from the view function

        .. versionchanged:: 0.9
           Previously a tuple was interpreted as the arguments for the
           response object.
        """
        status_or_headers = headers = None
        if isinstance(rv, tuple):
            rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))

        if rv is None:
            raise ValueError('View function did not return a response')

        if isinstance(status_or_headers, (dict, list)):
            headers, status_or_headers = status_or_headers, None

        if not isinstance(rv, self.response_class):
            # When we create a response object directly, we let the constructor
            # set the headers and status.  We do this because there can be
            # some extra logic involved when creating these objects with
            # specific values (like default content type selection).
            if isinstance(rv, (text_type, bytes, bytearray)):
                rv = self.response_class(rv, headers=headers,
                                         status=status_or_headers)
                headers = status_or_headers = None
            else:
                rv = self.response_class.force_type(rv, request.environ)

        if status_or_headers is not None:
            if isinstance(status_or_headers, string_types):
                rv.status = status_or_headers
            else:
                rv.status_code = status_or_headers
        if headers:
            rv.headers.extend(headers)

        return rv
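The tuple handling at the top of make_response is compact, so it helps to run it in isolation. The sketch below copies just those lines into a hypothetical unpack helper and leaves out the creation of the response object:

```python
def unpack(rv):
    """Mirror the tuple handling at the top of make_response."""
    status_or_headers = headers = None
    if isinstance(rv, tuple):
        # Pad with None so the three-way unpacking always works
        rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))
    if isinstance(status_or_headers, (dict, list)):
        # (body, headers) form: the second slot held headers, not a status
        headers, status_or_headers = status_or_headers, None
    return rv, status_or_headers, headers


plain = unpack('hi')
with_status = unpack(('missing', 404))
with_headers = unpack(('hi', {'X-Demo': '1'}))
full = unpack(('hi', 201, [('X-Demo', '1')]))
for item in (plain, with_status, with_headers, full):
    print(item)
```

This is why a view can return 'text', ('text', 404), ('text', headers) or ('text', 201, headers) and still end up with body, status, and headers in the right slots.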

Step 4: back inside wsgi_app:

    def wsgi_app(self, environ, start_response):
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied without losing a
        reference to the class.  So instead of doing this::

            app = MyMiddleware(app)

        It's a better idea to do this instead::

            app.wsgi_app = MyMiddleware(app.wsgi_app)

        Then you still have the original application object around and
        can continue to call methods on it.

        .. versionchanged:: 0.7
           The behavior of the before and after request callbacks was changed
           under error conditions and a new callback was added that will
           always execute at the end of the request, independent on if an
           error occurred or not.  See :ref:`callbacks-and-errors`.

        :param environ: a WSGI environment
        :param start_response: a callable accepting a status code,
                               a list of headers and an optional
                               exception context to start the response
        """
        ctx = self.request_context(environ)
        ctx.push()
        error = None
        try:
            try:
                response = self.full_dispatch_request()
            except Exception as e:
                error = e
                response = self.handle_exception(e)
            return response(environ, start_response)
        finally:
            if self.should_ignore_error(error):
                error = None
            ctx.auto_pop(error)

With that, the response obtained from full_dispatch_request() is called with environ and start_response, and the result is returned to the WSGI server (Gunicorn in this case).
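The line return response(environ, start_response) works because the response object is itself a WSGI callable. A stripped-down sketch of that idea follows; MiniResponse and recording_start_response are hypothetical names, and a real response class does far more.

```python
class MiniResponse:
    """Hypothetical, stripped-down response that is itself a WSGI callable."""

    def __init__(self, body, status='200 OK', headers=None):
        self.body = body.encode('utf-8') if isinstance(body, str) else body
        self.status = status
        self.headers = list(headers or [('Content-Type', 'text/html; charset=utf-8')])

    def __call__(self, environ, start_response):
        # Hand status and headers to the server, then return the body iterable
        start_response(self.status, self.headers)
        return [self.body]


sent = []


def recording_start_response(status, headers, exc_info=None):
    # Stand-in for the server's start_response; it only records the call
    sent.append((status, headers))


response = MiniResponse('Hello World!')
body = response({}, recording_start_response)
print(sent[0][0], body)
```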

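That completes the flow from HTTP request to response.

To recap the flow:

client -----> WSGI server ----> __call__ calls wsgi_app, which creates the request object and context ------> full_dispatch_request ----> dispatch_request routes the URL to a view function and gets its return value ------> make_response converts the view function's return value into a response_class object -------> the response object is called with environ and start_response, returning the final response to the server.

Reading source code on your own really is not easy, and it is hard without some background. But if you stick with it, don't you end up understanding things more deeply? Keep going. This is for every programmer who has read this far; let's encourage one another.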