Watchdog

A watchdog is a fixed-length counter that enables a system to recover from an unexpected hardware or software catastrophe. Unless the system periodically resets the watchdog timer, the timer assumes that a catastrophe has occurred and tries to handle the situation.

In general, there are two kinds of watchdog implementation: a hardware watchdog, and a software watchdog based on a timer interrupt. Both kinds are used in the system, but the software watchdog is not implemented in every subsystem. For example, RPM, TZ, and APSS (Android) do not have software watchdogs, while LPASS and MPSS do.

The hardware watchdog module is a piece of hardware that ensures the processor is not stuck or overloaded. It consists of a timer that counts down from a predetermined value. If the corresponding CPU core does not reset the timer (also known as a dog_kick, petting the dog, or servicing the dog), the timer eventually counts down to 0 and triggers a watchdog timeout. Each CPU core is responsible for resetting its counter in time. If it cannot do so, for example because the task dedicated to kicking the dog is starved or the CPU core is locked up, the system is assumed to have entered a bad state.

Historically, MSM ASICs used a single watchdog timer for the whole chip. The modem software was responsible for resetting the watchdog (kicking or petting the dog) and for checking that the other processors in the system were functional by performing periodic checks on them (handshaking through interrupt lines).

In addition to the reset-triggering signal (WatchDog_expired), the watchdog can also generate an interrupt before it expires, which gives a processor the chance to attempt recovery of the system before it is reset:

Bark (FIQ) – An interrupt raised before the watchdog expires, allowing the processor to attempt recovery of the system before it is reset.

Bite (Reset) – The reset triggered when the watchdog timeout occurs.

The watchdog timer continues counting after a bark occurs. However, it stops counting when a CPU is halted via a JTAG debugger.
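The bark and bite behaviour described above can be illustrated with a small simulation. This is only a minimal sketch in Python, not the register interface of any real watchdog block: the class WatchdogSim, its bark_ticks and bite_ticks parameters, the halted flag, and the idea of an abstract "tick" are all hypothetical names chosen for illustration.

```python
class WatchdogSim:
    """Toy model of a count-down watchdog with bark and bite (hypothetical)."""

    def __init__(self, bark_ticks, bite_ticks):
        # Assumption: the bark must be configured to fire before the bite.
        if not 0 < bark_ticks < bite_ticks:
            raise ValueError("require 0 < bark_ticks < bite_ticks")
        self.bark_ticks = bark_ticks
        self.bite_ticks = bite_ticks
        self.remaining = bite_ticks  # counts down towards 0
        self.barked = False
        self.bitten = False
        self.halted = False          # models a CPU halted via a debugger

    def kick(self):
        """Service the watchdog: restart the countdown from the full value."""
        if not self.bitten:
            self.remaining = self.bite_ticks
            self.barked = False

    def tick(self):
        """Advance the timer by one tick and report what happened."""
        if self.bitten:
            return "bite"            # the system would already have been reset
        if self.halted:
            return "halted"          # counting is suspended while halted
        self.remaining -= 1
        if self.remaining == 0:
            self.bitten = True
            return "bite"            # timeout: reset is triggered
        elapsed = self.bite_ticks - self.remaining
        if not self.barked and elapsed == self.bark_ticks:
            self.barked = True
            return "bark"            # early warning; countdown keeps running
        return "ok"


if __name__ == "__main__":
    wd = WatchdogSim(bark_ticks=3, bite_ticks=5)
    # No kicks at all: the bark arrives first, then the bite.
    print([wd.tick() for _ in range(5)])
```

In this model a bark does not pause or restart the countdown, so a core that receives the bark still has to call kick() (or finish its recovery) before the remaining ticks run out. Setting halted to True freezes the counter, mirroring the note that the timer stops when a CPU is halted through a JTAG debugger.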