
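Scrapy Source Code Analysis (3): The Signal Manager, SignalManager

The class lives at scrapy.signalmanager.SignalManager. It is essentially a thin wrapper around pydispatch.dispatcher.

First, let's look at what pydispatch.dispatcher offers (see the PyDispatcher project homepage).

The module provides a way to send and receive messages. The example from the homepage: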

To set up a function to receive signals:

from pydispatch import dispatcher
SIGNAL = 'my-first-signal'

def handle_event( sender ):
    """Simple event handler"""
    print('Signal was sent by', sender)
dispatcher.connect( handle_event, signal=SIGNAL, sender=dispatcher.Any )

The use of the Any object allows the handler to listen for messages from any Sender or to listen to Any message being sent.  To send messages:

first_sender = object()
second_sender = {}
def main( ):
    dispatcher.send( signal=SIGNAL, sender=first_sender )
    dispatcher.send( signal=SIGNAL, sender=second_sender )

Which causes the following to be printed:

Signal was sent by <object object at 0x196a090>
Signal was sent by {}

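At first glance this module looks event-driven. Note, however, that dispatcher.send itself is synchronous: it calls each connected receiver in turn and returns a list of (receiver, response) pairs. Now let's see how SignalManager wraps it: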

from pydispatch import dispatcher
from scrapy.utils import signal as _signal


class SignalManager(object):

    def __init__(self, sender=dispatcher.Anonymous):
        self.sender = sender

    def connect(self, receiver, signal, **kwargs):
        """
        Connect a receiver function to a signal.

        The signal can be any object, although Scrapy comes with some
        predefined signals that are documented in the :ref:`topics-signals`
        section.

        :param receiver: the function to be connected
        :type receiver: callable

        :param signal: the signal to connect to
        :type signal: object
        """
        kwargs.setdefault('sender', self.sender)
        return dispatcher.connect(receiver, signal, **kwargs)

    def disconnect(self, receiver, signal, **kwargs):
        """
        Disconnect a receiver function from a signal. This has the
        opposite effect of the :meth:`connect` method, and the arguments
        are the same.
        """
        kwargs.setdefault('sender', self.sender)
        return dispatcher.disconnect(receiver, signal, **kwargs)

    def send_catch_log(self, signal, **kwargs):
        """
        Send a signal, catch exceptions and log them.

        The keyword arguments are passed to the signal handlers (connected
        through the :meth:`connect` method).
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.send_catch_log(signal, **kwargs)

    def send_catch_log_deferred(self, signal, **kwargs):
        """
        Like :meth:`send_catch_log` but supports returning `deferreds`_ from
        signal handlers.

        Returns a Deferred that gets fired once all signal handlers
        deferreds were fired. Send a signal, catch exceptions and log them.

        The keyword arguments are passed to the signal handlers (connected
        through the :meth:`connect` method).

        .. _deferreds: http://twistedmatrix.com/documents/current/core/howto/defer.html
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.send_catch_log_deferred(signal, **kwargs)

    def disconnect_all(self, signal, **kwargs):
        """
        Disconnect all receivers from the given signal.

        :param signal: the signal to disconnect from
        :type signal: object
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.disconnect_all(signal, **kwargs)

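1. __init__

Stores the sender in self.sender. If none is given, it defaults to the anonymous sender object dispatcher.Anonymous.

2. connect(self, receiver, signal, **kwargs)

A wrapper around dispatcher.connect(receiver, signal, **kwargs). If the caller does not explicitly pass a sender, self.sender is used, which is dispatcher.Anonymous by default.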
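
The default-sender behaviour described above can be sketched without PyDispatcher. This is only an illustration: ToyDispatcher, ToySignalManager, and ANONYMOUS are hypothetical stand-ins, not part of Scrapy or PyDispatcher. It shows that kwargs.setdefault fills in the sender only when the caller did not pass one.

```python
# Toy stand-ins for illustration only; these names are hypothetical.
ANONYMOUS = object()  # plays the role of dispatcher.Anonymous


class ToyDispatcher:
    def __init__(self):
        self.connections = []  # list of (receiver, signal, sender)

    def connect(self, receiver, signal, sender):
        self.connections.append((receiver, signal, sender))


class ToySignalManager:
    def __init__(self, dispatcher, sender=ANONYMOUS):
        self.dispatcher = dispatcher
        self.sender = sender

    def connect(self, receiver, signal, **kwargs):
        # Fills in 'sender' only when the caller did not pass one.
        kwargs.setdefault('sender', self.sender)
        return self.dispatcher.connect(receiver, signal, **kwargs)


def on_opened():
    pass


toy = ToyDispatcher()
manager = ToySignalManager(toy)
manager.connect(on_opened, 'spider_opened')                    # default sender
manager.connect(on_opened, 'spider_opened', sender='crawler')  # explicit sender
senders = [sender for _, _, sender in toy.connections]
```

The first connection ends up with the anonymous placeholder as its sender, while the second keeps the explicit value.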

3. disconnect(self, receiver, signal, **kwargs)

Disconnects a receiver. The logic mirrors connect.

4. send_catch_log(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.send_catch_log(signal, **kwargs), which is imported as _signal:

def send_catch_log(signal=Any, sender=Anonymous, *arguments, **named):
    """Like pydispatcher.robust.sendRobust but it also logs errors and returns
    Failures instead of exceptions.
    """
    dont_log = named.pop('dont_log', _IgnoredException)
    spider = named.get('spider', None)
    responses = []
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        try:
            response = robustApply(receiver, signal=signal, sender=sender,
                *arguments, **named)
            if isinstance(response, Deferred):
                logger.error("Cannot return deferreds from signal handler: %(receiver)s",
                             {'receiver': receiver}, extra={'spider': spider})
        except dont_log:
            result = Failure()
        except Exception:
            result = Failure()
            logger.error("Error caught on signal handler: %(receiver)s",
                         {'receiver': receiver},
                         exc_info=True, extra={'spider': spider})
        else:
            result = response
        responses.append((receiver, result))
    return responses

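This function builds on pydispatch.robustapply.robustApply: it calls each live receiver, logs any exception that is raised, and stores it as a twisted.python.failure.Failure in the returned list instead of letting it propagate.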
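
The "call everyone, log failures, keep going" pattern can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not Scrapy's implementation: the receiver list is passed in directly instead of being looked up through the dispatcher, and the caught exception object stands in for a Twisted Failure.

```python
import logging

logger = logging.getLogger("toy_signals")


def send_catch_log(receivers, **named):
    """Call every receiver; one failing handler must not stop the rest."""
    responses = []
    for receiver in receivers:
        try:
            result = receiver(**named)
        except Exception as exc:
            # Scrapy stores a twisted Failure here; this toy keeps the exception.
            logger.error("Error caught on signal handler: %s", receiver,
                         exc_info=True)
            result = exc
        responses.append((receiver, result))
    return responses


def good_handler(item):
    return item.upper()


def bad_handler(item):
    raise ValueError("boom")


responses = send_catch_log([good_handler, bad_handler], item="abc")
```

Both handlers appear in responses: the first paired with its return value, the second paired with the exception it raised.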

5. send_catch_log_deferred(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.send_catch_log_deferred(signal, **kwargs):

def send_catch_log_deferred(signal=Any, sender=Anonymous, *arguments, **named):
    """Like send_catch_log but supports returning deferreds on signal handlers.
    Returns a deferred that gets fired once all signal handlers deferreds were
    fired.
    """
    def logerror(failure, recv):
        if dont_log is None or not isinstance(failure.value, dont_log):
            logger.error("Error caught on signal handler: %(receiver)s",
                         {'receiver': recv},
                         exc_info=failure_to_exc_info(failure),
                         extra={'spider': spider})
        return failure

    dont_log = named.pop('dont_log', None)
    spider = named.get('spider', None)
    dfds = []
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        d = maybeDeferred(robustApply, receiver, signal=signal, sender=sender,
                *arguments, **named)
        d.addErrback(logerror, receiver)
        d.addBoth(lambda result: (receiver, result))
        dfds.append(d)
    d = DeferredList(dfds)
    d.addCallback(lambda out: [x[1] for x in out])
    return d

A Deferred appears to be a placeholder for a result that is not available yet, similar to Tornado's Future.
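
To make the placeholder idea concrete without depending on Twisted, here is a rough analogue written with asyncio from the standard library. This swaps in a different technique: awaitables play the role of Deferreds, call_handler loosely corresponds to maybeDeferred, and asyncio.gather loosely corresponds to DeferredList. The handler names are hypothetical, and the result shape differs from Scrapy's.

```python
import asyncio


async def slow_handler(item):
    await asyncio.sleep(0.01)  # the result is not available yet
    return f"stored {item}"


def sync_handler(item):
    return f"seen {item}"


async def failing_handler(item):
    raise ValueError("boom")


async def call_handler(handler, item):
    # Rough analogue of maybeDeferred: accept plain values and awaitables alike.
    result = handler(item)
    if asyncio.iscoroutine(result):
        result = await result
    return handler, result


async def send_and_wait(handlers, item):
    # Rough analogue of DeferredList: completes once every handler has finished.
    return await asyncio.gather(
        *(call_handler(h, item) for h in handlers),
        return_exceptions=True,
    )


results = asyncio.run(
    send_and_wait([slow_handler, sync_handler, failing_handler], "item-1")
)
```

The caller gets a single awaitable that resolves only after the slow handler, the synchronous handler, and the failing handler have all completed, which is the role the DeferredList plays in send_catch_log_deferred.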

6. disconnect_all(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.disconnect_all(signal, **kwargs):

def disconnect_all(signal=Any, sender=Any):
    """Disconnect all signal handlers. Useful for cleaning up after running
    tests
    """
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        disconnect(receiver, signal=signal, sender=sender)

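It relies on three functions from pydispatch.dispatcher, namely liveReceivers, getAllReceivers, and disconnect: it collects every receiver registered for the signal, keeps only those that are still live, and disconnects them one by one.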
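
The "live receiver" check exists because receivers are held through weak references, so a receiver may already have been garbage-collected. A minimal sketch of that idea, assuming plain functions as receivers: ToyRegistry and its methods are hypothetical and much simpler than the real dispatcher, which also has to handle bound methods and senders.

```python
import weakref


class ToyRegistry:
    """Hypothetical stand-in for the dispatcher's connection table."""

    def __init__(self):
        self._receivers = {}  # signal -> list of weak references

    def connect(self, receiver, signal):
        self._receivers.setdefault(signal, []).append(weakref.ref(receiver))

    def live_receivers(self, signal):
        # Dereference each weak reference and skip collected receivers.
        for ref in list(self._receivers.get(signal, [])):
            receiver = ref()
            if receiver is not None:
                yield receiver

    def disconnect(self, receiver, signal):
        # Drop the given receiver along with any dead references.
        self._receivers[signal] = [
            ref for ref in self._receivers[signal]
            if ref() is not None and ref() is not receiver
        ]

    def disconnect_all(self, signal):
        for receiver in list(self.live_receivers(signal)):
            self.disconnect(receiver, signal)


def first():
    pass


def second():
    pass


registry = ToyRegistry()
registry.connect(first, "item_scraped")
registry.connect(second, "item_scraped")
del second  # in CPython this frees the function, so its weak reference dies
live_before = [r.__name__ for r in registry.live_receivers("item_scraped")]
registry.disconnect_all("item_scraped")
live_after = list(registry.live_receivers("item_scraped"))
```

After disconnect_all, iterating the live receivers for the signal yields nothing, whether a receiver was disconnected explicitly or had already been collected.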