
MXNet Code in Action: Multi-Class Logistic Regression

Multi-Class Logistic Regression

Before discussing multi-class logistic regression, we first need to understand logistic regression. Logistic regression (Logistic Regression) is a classification model in machine learning: despite the "regression" in its name, it performs classification. Put simply, it applies a sigmoid function to the output of linear regression, turning the result into a binary classification. Multi-class logistic regression instead applies a softmax function to the outputs, with the number of classes defined by your model.

As shown in the figure below, the yellow nodes are the input features and the green nodes are the output classes; multi-class logistic regression simply adds a softmax function on top of the green nodes' outputs to normalize them into probabilities:
[Figure: input-feature nodes (yellow) fully connected to class-output nodes (green), with a softmax applied to the outputs]
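
As a quick numeric sanity check (a minimal sketch in plain Python, independent of the MXNet code below): with two classes and logits [z, 0], softmax reproduces the sigmoid exactly, which is why softmax is the natural multi-class generalization of logistic regression:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    exps = [math.exp(z) for z in zs]
    return [e / sum(exps) for e in exps]

z = 1.5
print(sigmoid(z))            # 0.8175744...
print(softmax([z, 0.0])[0])  # 0.8175744..., identical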

Implementing Multi-Class Logistic Regression from Scratch

Code:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: yuquanle
# 2017/10/14
# Multi-class logistic regression, following Mu Li's tutorial
# This example classifies an MNIST-like dataset: MNIST contains digits, this one contains clothing

from mxnet import gluon
from mxnet import ndarray as nd

def transform(data, label):
    return data.astype('float32')/255, label.astype('float32')

mnist_train = gluon.data.vision.FashionMNIST(train=True, transform=transform)
mnist_test = gluon.data.vision.FashionMNIST(train=False, transform=transform)

# Clothing names corresponding to the labels
def get_text_labels(label):
    text_labels = [
        't-shirt', 'trouser', 'pullover', 'dress', 'coat',
        'sandal', 'shirt', 'sneaker', 'bag', 'ankle boot'
    ]
    return [text_labels[int(i)] for i in label]

# Data loading
batch_size = 256
# gluon.data's DataLoader yields one batch at a time
train_data = gluon.data.DataLoader(mnist_train, batch_size, shuffle=True)
test_data = gluon.data.DataLoader(mnist_test, batch_size, shuffle=False)

# Initialize the parameters
num_inputs = 784
num_outputs = 10
W = nd.random_normal(shape=(num_inputs, num_outputs))
b = nd.random_normal(shape=num_outputs)
params = [W, b]
for param in params:
    param.attach_grad()

# Define the model
# For multi-class classification the output is one probability per class, and the
# probabilities sum to 1; the softmax function enforces this
def softmax(X):
    exp = nd.exp(X)
    partition = exp.sum(axis=1, keepdims=True)
    return exp / partition

def net(X):
    return softmax(nd.dot(X.reshape((-1, num_inputs)), W) + b)

# Cross-entropy loss
# We need a loss function defined on probability outputs. The most common choice is
# cross-entropy: it takes the negative log-probability of the true class, so minimizing
# it is equivalent to maximizing the probability the model assigns to the correct label.
def cross_entropy(yhat, y):
    return - nd.pick(nd.log(yhat), y)

# Accuracy
# Given probability outputs, predict the class with the highest probability,
# then compare against the true labels
def accuracy(output, label):
    return nd.mean(output.argmax(axis=1)==label).asscalar()

def evaluate_accuracy(data_iterator, net):
    acc = 0
    for data, label in data_iterator:
        output = net(data)
        acc = acc + accuracy(output, label)
    return acc/len(data_iterator)

# SGD is the helper from the tutorial's utils.py module
from utils import SGD
from mxnet import autograd

learning_rate = 0.1
epochs = 5

for epoch in range(epochs):
    train_loss = 0
    train_acc = 0
    for data, label in train_data:
        with autograd.record():
            output = net(data)
            loss = cross_entropy(output, label)
        loss.backward()
        # Average the gradients over the batch so the learning rate is
        # less sensitive to the batch size
        SGD(params, learning_rate / batch_size)
        train_loss = train_loss + nd.mean(loss).asscalar()
        train_acc += accuracy(output, label)

    # Evaluate on the test set after each epoch
    test_acc = evaluate_accuracy(test_data, net)
    print("Epoch %d. Loss: %f, Train acc %f, Test acc %f" % (
        epoch, train_loss / len(train_data), train_acc / len(train_data), test_acc))

# Predict labels for new samples
# After training, W and b are fixed; feeding data through net is the prediction step
data, label = mnist_test[0:9]
print('true labels')
print(get_text_labels(label))
predicted_labels = net(data).argmax(axis=1)
print('predicted labels')
print(get_text_labels(predicted_labels.asnumpy()))

Results:
Epoch 0. Loss: 3.614154, Train acc 0.441933, Test acc 0.596094
Epoch 1. Loss: 1.931394, Train acc 0.625044, Test acc 0.651074
Epoch 2. Loss: 1.598601, Train acc 0.673343, Test acc 0.694531
Epoch 3. Loss: 1.420518, Train acc 0.701335, Test acc 0.711719
Epoch 4. Loss: 1.308131, Train acc 0.718661, Test acc 0.726855
true labels
['t-shirt', 'trouser', 'pullover', 'pullover', 'dress', 'pullover', 'bag', 'shirt', 'sandal']
predicted labels
['shirt', 'trouser', 'pullover', 't-shirt', 'dress', 'shirt', 'bag', 'coat', 'sandal']

Multi-Class Logistic Regression with Gluon

Code:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
#Author: yuquanle
#2017/10/15
# Multi-class logistic regression, following Mu Li's tutorial
# This example classifies an MNIST-like dataset: MNIST contains digits, this one contains clothing

# Clothing names corresponding to the labels
def get_text_labels(label):
    text_labels = [
        't-shirt', 'trouser', 'pullover', 'dress', 'coat',
        'sandal', 'shirt', 'sneaker', 'bag', 'ankle boot'
    ]
    return [text_labels[int(i)] for i in label]
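
# A quick usage check (illustrative): get_text_labels([0, 9]) -> ['t-shirt', 'ankle boot']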

# Implementation using gluon, MXNet's high-level abstraction package
from mxnet import gluon
from mxnet import ndarray as nd


batch_size = 256

def transform(data, label):
    return data.astype('float32')/255, label.astype('float32')
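
# transform scales the raw pixel values from [0, 255] to [0, 1] and
# casts the labels to float32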

mnist_train = gluon.data.vision.FashionMNIST(train=True, transform=transform)
mnist_test = gluon.data.vision.FashionMNIST(train=False, transform=transform)

train_data = gluon.data.DataLoader(mnist_train, batch_size, shuffle=True)
test_data = gluon.data.DataLoader(mnist_test, batch_size, shuffle=False)
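
# Optional sanity check (a sketch, not in the original script): each full training
# batch should be 256 images of shape 28 x 28 x 1 plus 256 labels
# for data, label in train_data:
#     print(data.shape, label.shape)  # (256, 28, 28, 1) (256,)
#     break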

# Define and initialize the model
# There is no need to specify each layer's input size; gluon infers it automatically
net = gluon.nn.Sequential()
with net.name_scope():
    # Flatten reshapes the input into a batch_size x ? matrix
    net.add(gluon.nn.Flatten())
    # 10 output nodes, one per class
    net.add(gluon.nn.Dense(10))
net.initialize()
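# Note: gluon defers parameter allocation until the first forward pass, when the
# input shape (here 784 after Flatten) becomes known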

# Softmax and cross-entropy loss
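# This loss takes raw logits directly (so the net needs no softmax layer) and fuses
# log-softmax with the cross-entropy, which is numerically more stable than the
# manual softmax + log + pick combination in the from-scratch version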
softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()

# Optimizer
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

# Training
from mxnet import autograd
import utils

for epoch in range(5):
    train_loss = 0.
    train_acc = 0.
    for data, label in train_data:
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
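        # step(batch_size) divides the accumulated gradients by the batch size,
        # mirroring the learning_rate / batch_size scaling in the from-scratch loop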
        trainer.step(batch_size)

        train_loss += nd.mean(loss).asscalar()
        train_acc += utils.accuracy(output, label)

    # Evaluate on the test set after each epoch
    test_acc = utils.evaluate_accuracy(test_data, net)
    print("Epoch %d. Loss: %f, Train acc %f, Test acc %f" % (
        epoch, train_loss / len(train_data), train_acc / len(train_data), test_acc))
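
The true/predicted label lists in the results below come from the same prediction step as in the from-scratch version; a sketch of the lines presumably appended to the script (reusing get_text_labels defined above):

data, label = mnist_test[0:9]
print('true labels')
print(get_text_labels(label))
predicted_labels = net(data).argmax(axis=1)
print('predicted labels')
print(get_text_labels(predicted_labels.asnumpy()))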

Results:
Epoch 0. Loss: 0.791282, Train acc 0.745268, Test acc 0.802637
Epoch 1. Loss: 0.575680, Train acc 0.808965, Test acc 0.820605
Epoch 2. Loss: 0.530466, Train acc 0.823908, Test acc 0.830273
Epoch 3. Loss: 0.505710, Train acc 0.830430, Test acc 0.836816
Epoch 4. Loss: 0.490304, Train acc 0.834707, Test acc 0.836816
true labels
['t-shirt', 'trouser', 'pullover', 'pullover', 'dress', 'pullover', 'bag', 'shirt', 'sandal']
predicted labels
['t-shirt', 'trouser', 'pullover', 'shirt', 'coat', 'shirt', 'bag', 'shirt', 'sandal']


The experiment shows that with only 5 training epochs, a few of the sample predictions are still wrong, although the Gluon version already reaches a noticeably higher test accuracy than the from-scratch run.