Coursera Probabilistic Graphical Models: Week 1 Programming Assignment Analysis

Computing probability queries in a Bayesian network

1. Basic factor operations

The structure of a factor in the assignment:

phi = struct('var', [3 1 2], 'card', [2 2 2], 'val', ones(1, 8));

Here, var gives the labels of the variables in the factor and their order; card gives the cardinalities, i.e. the number of states of each variable; and val holds the probability value for each joint assignment of the variables, so its length equals prod(card).
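As a small illustration (a sketch only; it assumes the helper functions IndexToAssignment and AssignmentToIndex that ship with the assignment are on the path), each linear index into val corresponds to one joint assignment of the variables in var, with the first variable in var varying fastest:

% A factor over the binary variables 3, 1 and 2; val holds prod([2 2 2]) = 8 entries.
phi = struct('var', [3 1 2], 'card', [2 2 2], 'val', ones(1, 8));

% Map every linear index of phi.val to its joint assignment of (3, 1, 2).
assignments = IndexToAssignment(1:prod(phi.card), phi.card);   % an 8 x 3 matrix

% And back: the linear index of the assignment (3 = 2, 1 = 1, 2 = 1).
idx = AssignmentToIndex([2 1 1], phi.card);                    % idx == 2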

FactorProduct.m: computing the product of two factors

Input:

FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]);

FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]);

FACTORS.PRODUCT = FactorProduct(FACTORS.INPUT(1), FACTORS.INPUT(2));

Expected output:

FACTORS.PRODUCT = struct('var', [1, 2], 'card', [2, 2], 'val', [0.0649, 0.1958, 0.0451, 0.6942]);

For a Bayesian network, the factor product is really just the Bayesian chain rule. For example, if FACTORS.INPUT(1) represents the distribution of whether a student's intelligence is normal, P(I), and FACTORS.INPUT(2) represents the distribution of whether the student passes the exam given that intelligence, P(S | I), then their joint distribution can be written as FACTORS.PRODUCT = FactorProduct(FACTORS.INPUT(1), FACTORS.INPUT(2)), i.e. P(I, S).

The computation itself is simply the chain-rule step:

P(I, S) = P(S | I) * P(I)

Reference code:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% YOUR CODE HERE:

% Correctly populate the factor values of C

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

for ii = 1 : length(C.val)

C.val(ii) = A.val(indxA(ii)) * B.val(indxB(ii));

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
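For context, indxA and indxB are prepared by the starter code before this loop: they map every entry of C back to the matching entries of A and B. A rough sketch of that setup (my reconstruction, not the verbatim starter code; names match the snippet above):

% Variables of the product, and where A's and B's variables sit inside it.
C.var = union(A.var, B.var);
[~, mapA] = ismember(A.var, C.var);
[~, mapB] = ismember(B.var, C.var);

% Cardinalities of the product and a value table of the right size.
C.card = zeros(1, length(C.var));
C.card(mapA) = A.card;
C.card(mapB) = B.card;
C.val = zeros(1, prod(C.card));

% For every assignment over C.var, locate the corresponding entries of A and B.
assignments = IndexToAssignment(1:prod(C.card), C.card);
indxA = AssignmentToIndex(assignments(:, mapA), A.card);
indxB = AssignmentToIndex(assignments(:, mapB), B.card);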

FactorMarginalization.m: computing the marginal of a factor

Input:

FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]);

FACTORS.MARGINALIZATION = FactorMarginalization(FACTORS.INPUT(2), [2]);

Expected output:

FACTORS.MARGINALIZATION = struct('var', [1], 'card', [2], 'val', [1 1]);

Essentially, marginalization is just a summation: sum the values over the variable(s) being summed out.

Reference code:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% YOUR CODE HERE

% Correctly populate the factor values of B

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

for ii = 1 : length(unique(indxB))

B.val(ii) = sum(A.val(indxB == ii));

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
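Here, too, indxB comes from the starter code: it maps each entry of A.val to the entry of B.val it should be summed into. Roughly (again a reconstruction rather than the verbatim starter code; V denotes the variables being summed out):

% B keeps the variables of A that are not summed out.
B.var = setdiff(A.var, V);
[~, mapB] = ismember(B.var, A.var);
B.card = A.card(mapB);
B.val = zeros(1, prod(B.card));

% Every assignment over A.var collapses to one index into B.val.
assignments = IndexToAssignment(1:length(A.val), A.card);
indxB = AssignmentToIndex(assignments(:, mapB), B.card);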

ObserveEvidence.m: observing evidence

Input:

FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]);

FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]);

FACTORS.INPUT(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0.39, 0.61, 0.06, 0.94]);

FACTORS.EVIDENCE = ObserveEvidence(FACTORS.INPUT, [2 1; 3 2]);

Expected output:

FACTORS.EVIDENCE(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]);

FACTORS.EVIDENCE(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0, 0.22, 0]);

FACTORS.EVIDENCE(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0, 0.61, 0, 0]);

In ObserveEvidence, the second argument is an N x 2 matrix: the first column lists the observed variables and the second column their observed values. Within each factor, only the entries in which an observed variable takes its observed value are kept; entries in which an observed variable takes any other value are set to 0. Unobserved variables are unaffected.

Reference code:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% YOUR CODE HERE

% Adjust the factor F(j) to account for observed evidence

% Hint: You might find it helpful to use IndexToAssignment

% and SetValueOfAssignment

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

assignments = IndexToAssignment(1 : length(F(j).val), F(j).card);

F(j).val(assignments(:, indx) ~= x) = 0;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
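For reference, this snippet lives inside a double loop supplied by the starter code: the outer loop walks over the rows of the evidence matrix E, the inner loop over the factors, and only factors containing the observed variable are modified. Approximately:

for i = 1:size(E, 1)
    v = E(i, 1);   % the observed variable
    x = E(i, 2);   % its observed value
    for j = 1:length(F)
        indx = find(F(j).var == v);   % position of v inside factor j, if present
        if ~isempty(indx)
            % Zero out every entry whose assignment to v differs from x.
            assignments = IndexToAssignment(1:length(F(j).val), F(j).card);
            F(j).val(assignments(:, indx) ~= x) = 0;
        end
    end
end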

2. Computing the joint distribution

ComputeJointDistribution.m: computing the joint distribution of a Bayesian network

Input:

FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]);

FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]);

FACTORS.INPUT(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0.39, 0.61, 0.06, 0.94]);

FACTORS.JOINT = ComputeJointDistribution(FACTORS.INPUT);

Expected output:

FACTORS.JOINT = struct('var', [1, 2, 3], 'card', [2, 2, 2], 'val', [0.025311, 0.076362, 0.002706, 0.041652, 0.039589, 0.119438, 0.042394, 0.652548]);

As mentioned above, in a Bayesian network the joint distribution is just the product of all of its factors. Written out for this example:

P(X1, X2, X3) = P(X1) * P(X2 | X1) * P(X3 | X2)

Reference code:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% YOUR CODE HERE:

% Compute the joint distribution defined by F

% You may assume that you are given legal CPDs so no input checking is required.

%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Joint = F(1);

for ii = 2 : length(F)

Joint = FactorProduct(Joint, F(ii));

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
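A quick sanity check (assuming the FACTORS.INPUT factors defined above): since each input factor is a legal CPD, the entries of the resulting joint should sum to 1.

FACTORS.JOINT = ComputeJointDistribution(FACTORS.INPUT);
disp(sum(FACTORS.JOINT.val));   % prints 1 (up to rounding)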

3. Computing marginal distributions

ComputeMarginal.m: computing a marginal distribution of a Bayesian network

Input:

FACTORS.INPUT(1) = struct('var', [1], 'card', [2], 'val', [0.11, 0.89]);

FACTORS.INPUT(2) = struct('var', [2, 1], 'card', [2, 2], 'val', [0.59, 0.41, 0.22, 0.78]);

FACTORS.INPUT(3) = struct('var', [3, 2], 'card', [2, 2], 'val', [0.39, 0.61, 0.06, 0.94]);

FACTORS.MARGINAL = ComputeMarginal([2, 3], FACTORS.INPUT, [1, 2]);

Expected output:

FACTORS.MARGINAL = struct('var', [2, 3], 'card', [2, 2], 'val', [0.0858, 0.0468, 0.1342, 0.7332]);

Compared with the plain factor marginalization above, the main addition here is the normalization step; you also have to handle variables shared across several factors, which is taken care of by computing the joint distribution first.

Reference code:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% YOUR CODE HERE:

% M should be a factor

% Remember to renormalize the entries of M!

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Joint = ComputeJointDistribution(F);

M = FactorMarginalization(ObserveEvidence(Joint, E), setdiff(Joint.var, V));

M.val = M.val ./ sum(M.val);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
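As a small usage example (a hypothetical extra query on the same factors, relying on the reference code above): marginalizing down to a single variable given the evidence X1 = 2 should again produce a normalized distribution.

% P(X3 | X1 = 2): keep variable 3, condition on variable 1 taking value 2.
M = ComputeMarginal([3], FACTORS.INPUT, [1 2]);
disp(M.var);   % [3]
disp(M.val);   % two entries summing to 1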

A couple of supplementary notes on the factor fields:

var: the labels of the variables and how they relate within the factor. For example, ('var', [3, 1]) describes the conditional distribution of a phenotype variable (phenotypeVar = 3) given a genotype variable (genotypeVar = 1).

card: short for cardinalities; each entry gives the number of possible values of the corresponding variable in var.
