1. 程式人生 > >HMM-前向後向演算法理解與實現(python)

HMM-前向後向演算法理解與實現(python)

[TOC] ### 基本要素 - 狀態 $N$個 - 狀態序列 $S = s_1,s_2,...$ - 觀測序列 $O=O_1,O_2,...$ - $\lambda(A,B,\pi)$ - 狀態轉移概率 $A = \{a_{ij}\}$ - 發射概率 $B = \{b_{ik}\}$ - 初始概率分佈 $\pi = \{\pi_i\}$ - 觀測序列生成過程 - 初始狀態 - 選擇觀測 - 狀態轉移 - 返回step2 ### HMM三大問題 - 概率計算問題(評估問題) 給定觀測序列 $O=O_1O_2...O_T$,模型 $\lambda (A,B,\pi)$,計算 $P(O|\lambda)$,即計算觀測序列的概率 - 解碼問題 給定觀測序列 $O=O_1O_2...O_T$,模型 $\lambda (A,B,\pi)$,找到對應的狀態序列 $S$ - 學習問題 給定觀測序列 $O=O_1O_2...O_T$,找到模型引數 $\lambda (A,B,\pi)$,以最大化 $P(O|\lambda)$, ### 概率計算問題 給定模型 $\lambda$ 和觀測序列 $O$,如何計算$P(O| \lambda)$? **暴力列舉**每一個可能的狀態序列 $S$ - 對每一個給定的狀態序列 $$ P(O|S,\lambda) = \prod^T_{t=1} P(O_t|s_t,\lambda) =\prod^T_{t=1} b_{s_tO_t} $$ - 一個狀態序列的產生概率 $$ P(S|\lambda) = P(s_1)\prod^T_{t=2}P(s_t|s_{t-1})=\pi_1\prod^T_{t=2}a_{s_{t-1}s_t} $$ - 聯合概率 $$ P(O,S|\lambda) = P(S|\lambda)P(O|S,\lambda) =\pi_1\prod^T_{t=2}a_{s_{t-1}s_t}\prod^T_{t=1} b_{s_tO_t} $$ - 考慮所有的狀態序列 $$ P(O|\lambda)=\sum_S\pi_1b_{s_1O_1}\prod^T_{t=2}a_{s_{t-1}s_t}b_{s_tO_t} $$ $O$ 可能由任意一個狀態得到,所以需要將每個狀態的可能性相加。 這樣做什麼問題?時間複雜度高達 $O(2TN^T)$。每個序列需要計算 $2T$ 次,一共 $N^T$ 個序列。 #### 前向演算法 在時刻 $t$,狀態為 $i$ 時,前面的時刻觀測到 $O_1,O_2, ..., O_t$ 的概率,記為 $\alpha _i(t)$ : $$ \alpha_{i}(t)=P\left(O_{1}, O_{2}, \ldots O_{t}, s_{t}=i | \lambda\right) $$ 當 $t=1$ 時,輸出為 $O_1$,假設有三個狀態,$O_1$ 可能是任意一個狀態發出,即 $$ P(O_1|\lambda) = \pi_1b_1(O_1)+\pi_2b_2(O_1)+\pi_2b_3(O_1) = \alpha_1(1)+\alpha_2(1)+\alpha_3(1) $$ ![image-20200511202908264](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200511202908264.png) 當 $t=2$ 時,輸出為 $O_1O_2$ ,$O_2$ 可能由任一個狀態發出,同時產生 $O_2$ 對應的狀態可以由 $t=1$ 時刻任意一個狀態轉移得到。假設 $O_2$ 由狀態 `1` 發出,如下圖 ![image-20200511203749699](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200511203749699.png) $$ P(O_1O_2,s_2=q_1|\lambda) = \pi_1b_1(O_1)a_{11}b_1(O_2)+\pi_2b_2(O_1)a_{21}b_1(O_2)+\pi_2b_3(O_1)a_{31}b_1(O_2) \\ =\bold{\alpha_1(1)}a_{11}b_1(O_2)+\bold{\alpha_2(1)}a_{21}b_1(O_2)+\bold{\alpha_3(1)}a_{31}b_1(O_2) = \bold{\alpha_1(2)} $$ 同理可得 $\alpha_2(2),\alpha_3(2)$ $$ \bold{\alpha_2(2)} = P(O_1O_2,s_2=q_2|\lambda) =\bold{\alpha_1(1)}a_{12}b_2(O_2)+\bold{\alpha_2(1)}a_{22}b_2(O_2)+\bold{\alpha_3(1)}a_{32}b_2(O_2) \\ \bold{\alpha_3(2)} = P(O_1O_2,s_2=q_3|\lambda) =\bold{\alpha_1(1)}a_{13}b_3(O_2)+\bold{\alpha_2(1)}a_{23}b_3(O_2)+\bold{\alpha_3(1)}a_{33}b_3(O_2) $$ 所以 $$ P(O_1O_2|\lambda) =P(O_1O_2,s_2=q_1|\lambda)+ P(O_1O_2,s_2=q_2|\lambda) +P(O_1O_2,s_2=q_3|\lambda)\\ = \alpha_1(2)+\alpha_2(2)+\alpha_3(2) $$ 所以**前向演算法**過程如下: ​ step1:初始化 $\alpha_i(1)= \pi_i*b_i(O_1)$ ​ step2:計算 $\alpha_i(t) = (\sum^{N}_{j=1} \alpha_j(t-1)a_{ji})b_i(O_{t})$ ​ step3:$P(O|\lambda) = \sum^N_{i=1}\alpha_i(T)$ 相比暴力法,時間複雜度降低了嗎? 當前時刻有 $N$ 個狀態,每個狀態可能由前一時刻 $N$ 個狀態中的任意一個轉移得到,所以單個時刻的時間複雜度為 $O(N^2)$,總**時間複雜度**為 $O(TN^2)$ **程式碼實現** 例子: 假設從三個 袋子 `{1,2,3}`中 取出 4 個球 `O={red,white,red,white}`,模型引數$\lambda = (A,B,\pi)$ 如下,計算序列`O`出現的概率 ``` #狀態 1 2 3 A = [[0.5,0.2,0.3], [0.3,0.5,0.2], [0.2,0.3,0.5]] pi = [0.2,0.4,0.4] # red white B = [[0.5,0.5], [0.4,0.6], [0.7,0.3]] ``` ​ step1:初始化 $\alpha_i(1)= \pi_i*b_i(O_1)$ ​ step2:計算 $\alpha_i(t) = (\sum^{N}_{j=1} \alpha_j(t-1)a_{ji})b_i(O_{t})$ ​ step3:$P(O|\lambda) = \sum^N_{i=1}\alpha_i( T)$ ```python #前向演算法 def hmm_forward(A,B,pi,O): T = len(O) N = len(A[0]) #step1 初始化 alpha = [[0]*T for _ in range(N)] for i in range(N): alpha[i][0] = pi[i]*B[i][O[0]] #step2 計算alpha(t) for t in range(1,T): for i in range(N): temp = 0 for j in range(N): temp += alpha[j][t-1]*A[j][i] alpha[i][t] = temp*B[i][O[t]] #step3 proba = 0 for i in range(N): proba += alpha[i][-1] return proba,alpha A = [[0.5,0.2,0.3],[0.3,0.5,0.2],[0.2,0.3,0.5]] B = [[0.5,0.5],[0.4,0.6],[0.7,0.3]] pi = [0.2,0.4,0.4] O = [0,1,0,1] hmm_forward(A,B,pi,O) #結果為 0.06009 ``` 結果 ![image-20200512195503450](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200512195503450.png) #### 後向演算法 在時刻 $t$,狀態為 $i$ 時,觀測到 $O_{t+1},O_{t+2}, ..., O_T$ 的概率,記為 $\beta _i(t)$ : $$ \beta_{i}(t)=P\left(O_{t+1},O_{t+2}, ..., O_T | s_{t}=i, \lambda\right) $$ 當 $t=T$ 時,由於 $T$ 時刻之後為空,沒有觀測,所以 $\beta_i(t)=1$ 當 $t = T-1$ 時,觀測 $O_T$ ,$O_T$ 可能由任意一個狀態產生 $$ \beta_i(T-1) = P(O_T|s_{t}=i,\lambda) = a_{i1}b_1(O_T)\beta_1(T)+a_{i2}b_2(O_T)\beta_2(T)+a_{i3}b_3(O_T)\beta_3(T) $$ ![image-20200511214910979](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200511214910979.png) 當 $t=1$ 時,觀測為 $O_{2},O_{3}, ..., O_T$ $$ \begin{aligned} \beta_1(1) &= P(O_{2},O_{3}, ..., O_T|s_1=1,\lambda)\\ &=a_{11}b_1(O_2)\beta_1(2)+a_{12}b_2(O_2)\beta_2(2)+a_{13}b_3(O_2)\beta_3(2) \\ \quad \\ \beta_2(1) &= P(O_{2},O_{3}, ..., O_T|s_1=2,\lambda)\\ &=a_{21}b_1(O_2)\beta_1(2)+a_{22}b_2(O_2)\beta_2(2)+a_{23}b_3(O_2)\beta_3(2) \\ \quad \\ \beta_3(1) &=P(O_{2},O_{3}, ..., O_T|s_1=3,\lambda)\\ &=a_{31}b_1(O_2)\beta_1(2)+a_{32}b_2(O_2)\beta_2(2)+a_{33}b_3(O_2)\beta_3(2) \end{aligned} $$ 所以 $$ P(O_{2},O_{3}, ..., O_T|\lambda) = \beta_1(1)+\beta_2(1)+\beta_3(1) $$ 後向演算法過程如下: ​ step1:初始化 $\beta_i(T)=1$ ​ step2:計算 $\beta_i(t) = \sum^N_{j=1}a_{ij}b_j(O_{t+1})\beta_j(t+1)$ ​ step3:$P(O|\lambda) = \sum^N_{i=1}\pi_ib_i(O_1)\beta_i(1)$ - 時間複雜度 $O(N^2T)$ **程式碼實現** 還是上面的例子 ```python #後向演算法 def hmm_backward(A,B,pi,O): T = len(O) N = len(A[0]) #step1 初始化 beta = [[0]*T for _ in range(N)] for i in range(N): beta[i][-1] = 1 #step2 計算beta(t) for t in reversed(range(T-1)): for i in range(N): for j in range(N): beta[i][t] += A[i][j]*B[j][O[t+1]]*beta[j][t+1] #step3 proba = 0 for i in range(N): proba += pi[i]*B[i][O[0]]*beta[i][0] return proba,beta A = [[0.5,0.2,0.3],[0.3,0.5,0.2],[0.2,0.3,0.5]] B = [[0.5,0.5],[0.4,0.6],[0.7,0.3]] pi = [0.2,0.4,0.4] O = [0,1,0,1] hmm_backward(A,B,pi,O) #結果為 0.06009 ``` 結果 ![image-20200512195526215](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200512195526215.png) #### 前向-後向演算法 ![image-20200511201506794](https://gitee.com/gongyanzh/blogpic/raw/master/pictures/image-20200511201506794.png) 回顧前向、後向變數: - $a_i(t)$ 時刻 $t$,狀態為 $i$ ,觀測序列為 $O_1,O_2, ..., O_t$ 的概率 - $\beta_i(t)$ 時刻 $t$,狀態為 $i$ ,觀測序列為 $O_{t+1},O_{t+2}, ..., O_T$ 的概率 $$ \begin{aligned} P(O,s_t=i|\lambda) &= P(O_1,O_2, ..., O_T,s_t=i|\lambda)\\ &= P(O_1,O_2, ..., O_t,s_t=i,O_{t+1},O_{t+2}, ..., O_T|\lambda)\\ &= P(O_1,O_2, ..., O_t,s_t=i|\lambda)*P(O_{t+1},O_{t+2}, ..., O_T|O_1,O_2, ..., O_t,s_t=i,\lambda) \\ &= P(O_1,O_2, ..., O_t,s_t=i|\lambda)*P(O_{t+1},O_{t+2}, ..., O_T,s_t=i|\lambda)\\ &= a_i(t)*\beta_i(t) \end{aligned} $$ 即在給定的狀態序列中,$t$ 時刻狀態為 $i$ 的概率。 使用前後向演算法可以計算隱狀態,記 $\gamma_i(t) = P(s_t=i|O,\lambda)$ 表示時刻 $t$ 位於隱狀態 $i$ 的概率 $$ P\left(s_{t}=i, O | \lambda\right)=\alpha_{i}(t) \beta_{i}(t) $$ $$ \begin{aligned} \gamma_{i}(t) &=P\left(s_{t}={i} | O, \lambda\right)=\frac{P\left(s_{t}={i}, O | \lambda\right)}{P(O | \lambda)} \\ &=\frac{\alpha_{i}(t) \beta_{i}(t)}{P(O | \lambda)}=\frac{\alpha_{i}(t) \beta_{i}(t)}{\sum_{i=1}^{N} \alpha_{i}(t) \beta_{i}(t)} \end{aligned} $$ references: [1] https://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf [2]https://www.cnblogs.com/fulcra/p/11065474.html [3] https://www.cnblogs.com/sjjsxl/p/6285629.html [4] https://blog.csdn.net/xueyingxue001/article/details/5