Two Examples of Minimum Error Pruning (reprint)

The first example

Expected Error Pruning
Approximate expected error assuming that we prune at a particular node.

Approximate backed-up error from children assuming we did not prune.

If the expected error is less than the backed-up error, prune.
Expected Error
If we prune a node, it becomes a leaf labelled C.
What will be the expected classification error at this leaf?
$$E(S)=\frac{N-n+k-1}{N+k}$$
(This is called the Laplace error estimate - it is based on the assumption that the distribution of probabilities that examples will belong to different classes is uniform.)

S is the set of examples in a node
k is the number of classes
N is the number of examples in S
C is the majority class in S
n out of N examples in S belong to C
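
As a minimal sketch, the Laplace estimate can be computed directly from a node's class counts. The function name `laplace_error` and the list-of-counts input are assumptions made for illustration; they do not come from the original article. The list must contain one entry per class, including zeros, so that its length equals k.

```python
def laplace_error(class_counts):
    """Laplace error estimate E(S) = (N - n + k - 1) / (N + k).

    class_counts holds one count per class (zeros included), so
    k = len(class_counts), N = sum(class_counts), n = max(class_counts).
    """
    k = len(class_counts)
    N = sum(class_counts)
    n = max(class_counts)  # size of the majority class C
    return (N - n + k - 1) / (N + k)


# Even a pure node gets a non-zero expected error under this estimate.
print(round(laplace_error([1, 0]), 3))  # 0.333
print(round(laplace_error([3, 2]), 3))  # 0.429
```

Because of the k - 1 term in the numerator, small nodes are penalised more heavily than large ones with the same class proportions.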

Backed-up Error
For a non-leaf node

Let the children of Node be Node1, Node2, etc.

$$Backed\ Up\ Error(Node)=\sum_{i}P_i\cdot Error(Node_i)$$

Probabilities can be estimated by relative frequencies of attribute values in sets of examples that fall into child nodes.

$$Error(Node)=\min(E(Node),\ Backed\ Up\ Error(Node))$$
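
The recursion defined by these two formulas can be sketched as a bottom-up pass over the tree. The `Node` class, the `static_error` and `prune` names, and the in-place pruning are all hypothetical choices for this sketch, not the original authors' implementation. It assumes every node stores one count per class and that the children's counts add up to the parent's.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    class_counts: List[int]  # examples per class reaching this node
    children: List["Node"] = field(default_factory=list)


def static_error(node):
    # Laplace estimate E(S) = (N - n + k - 1) / (N + k)
    k = len(node.class_counts)
    N = sum(node.class_counts)
    n = max(node.class_counts)
    return (N - n + k - 1) / (N + k)


def prune(node):
    """Prune the subtree in place, bottom-up, and return Error(node)."""
    e = static_error(node)
    if not node.children:
        return e
    total = sum(node.class_counts)
    # P_i is estimated as the fraction of this node's examples sent to child i.
    backed_up = sum(
        sum(child.class_counts) / total * prune(child)
        for child in node.children
    )
    if e < backed_up:
        node.children = []  # the node becomes a leaf
        return e
    return backed_up


# A split that separates the classes perfectly is kept.
root = Node([4, 4], [Node([4, 0]), Node([0, 4])])
print(round(prune(root), 3), len(root.children))  # 0.167 2
```

Children are pruned before their parent, so each node's backed-up error already reflects any pruning decided further down the tree.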

Pruning
[Figure: the example tree containing node b (image missing from the source)]

Error Calculation
Left child of b has class frequencies [3, 2], so N=5, n=3, and k=2:
$$E=\frac{N-n+k-1}{N+k}=\frac{5-3+2-1}{5+2}=\frac{3}{7}\approx 0.429$$
Right child (class frequencies [1, 0]) has an error of 1/3 ≈ 0.333, calculated in the same way.

The static error estimate E(b) is 0.375, again calculated using the Laplace error estimate formula, with N=6, n=4, and k=2.

Backed-up error is:
$$Backed\ Up\ Error(b)=\frac{5}{6}\cdot 0.429+\frac{1}{6}\cdot 0.333=0.413$$
(The weights are 5/6 and 1/6 because there are 4+2=6 examples handled by node b, of which 3+2=5 go to the left subtree and 1 to the right subtree.)

Since the backed-up estimate of 0.413 is greater than the static estimate of 0.375, we prune the tree at b and use the static error of 0.375.
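
The arithmetic for node b can be checked with exact fractions. This is only a verification sketch: the helper name `laplace_error` is an assumption, and the right child's frequencies [1, 0] are derived from the totals given in the text (b holds [4, 2] and its left child holds [3, 2]).

```python
from fractions import Fraction


def laplace_error(N, n, k):
    # E = (N - n + k - 1) / (N + k), kept exact with Fraction
    return Fraction(N - n + k - 1, N + k)


k = 2
e_left = laplace_error(N=5, n=3, k=k)    # left child, frequencies [3, 2]
e_right = laplace_error(N=1, n=1, k=k)   # right child, frequencies [1, 0]
e_static = laplace_error(N=6, n=4, k=k)  # node b itself, frequencies [4, 2]

# Weight each child's error by the share of b's examples it receives.
backed_up = Fraction(5, 6) * e_left + Fraction(1, 6) * e_right

print(e_left, e_right, e_static, backed_up)  # 3/7 1/3 3/8 26/63
print(backed_up > e_static)                  # True, so b is pruned
```

The exact backed-up value 26/63 is about 0.413, which matches the rounded figure used above.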

The MEP pruning algorithm was introduced in:
Niblett, T., Bratko, I. "Learning decision rules in noisy domains". Conference on Expert Systems, 1986.

There are two versions of MEP. The one above is the earlier; the other is described in
"On estimating probabilities in tree pruning".

The second example

Reference:
"An Empirical Comparison of Pruning Methods for Decision Tree Induction"
[Figures: four images showing the second example (missing from the source)]