TensorFlow examples for beginners (linear regression and logistic regression)

阿新 • Published: 2018-11-10

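1. Preface: introduction to linear regression and logistic regression

Introductory TensorFlow material usually covers at least two examples: linear regression and logistic regression (in other words, a regression algorithm and a classification algorithm).
Linear regression is used for regression prediction and logistic regression for binary classification: one solves regression problems, the other classification problems. The differences between the two:

Different fitting functions: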

Linear regression: $f(x) = θ^{T} X = θ_{1} x_{1} + θ_{2} x_{2} + \dots + θ_{n} x_{n}$

Logistic regression: $f(x) = p(y = 1 \mid x; θ) = g(θ^{T} X)$, where $g(z) = \frac{1}{1 + e^{-z}}$ is the sigmoid function used in the second example.

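Different value ranges:
In linear regression the sample outputs are continuous values, $y \in (-\infty, +\infty)$, whereas in logistic regression $y \in \{0, 1\}$: the output can only be 0 or 1.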

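In linear regression, $θ^{T} X$ is the fitted function that gives the predicted value. In logistic regression, $θ^{T} X = 0$ is the decision boundary: when $θ^{T} X < 0$ the predicted probability is below 0.5, when $θ^{T} X > 0$ it is above 0.5, and as $θ^{T} X$ goes to negative or positive infinity it approaches 0 or 1 respectively.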
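To make the decision boundary concrete, here is a minimal sketch in plain Python. The parameter vector `theta` and the inputs are hypothetical values chosen only for illustration, and the helper names `sigmoid` and `linear_score` are not from the original post.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(theta, x):
    """Compute theta^T x for two equal-length lists."""
    return sum(t * xi for t, xi in zip(theta, x))

# Hypothetical parameters and inputs, for illustration only.
theta = [2.0, -1.0]
for x in ([1.0, 2.0], [2.0, 1.0], [0.0, 3.0]):
    z = linear_score(theta, x)      # linear regression would output z directly
    p = sigmoid(z)                  # logistic regression squashes z into (0, 1)
    label = 1 if p > 0.5 else 0     # p > 0.5 exactly when z > 0
    print(x, z, round(p, 3), label)
```

The first input lies exactly on the boundary (the score is 0, so the probability is 0.5); the other two fall on either side of it.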

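2. Linear regression example

We simulate the samples instead of downloading a dataset, to avoid unnecessary confusion.
The sample inputs are x_vals and the sample outputs are y_vals. The model is simply a linear function:
y = w * x

The code: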

import tensorflow as tf
import numpy as np

# Note: this uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).

# Sample inputs: 100 values drawn from a normal distribution with mean 1 and standard deviation 0.1
x_vals = np.random.normal(1, 0.1, 100)
# Sample outputs: an array of 100 values, all equal to 10.0
y_vals = np.repeat(10.0, 100)

x_data = tf.placeholder(shape=[1], dtype=tf.float32)
y_target = tf.placeholder(shape=[1], dtype=tf.float32)

A = tf.Variable(tf.random_normal(shape=[1]))

# The model is a linear function y = w * x, i.e. my_output = A * x_data.
# x_data is fed from the samples x_vals. The goal is to find the value of A.
# Since every y is 10.0 and x has mean 1, we can already guess that A should be close to 10.
my_output = tf.multiply(x_data, A)

# Loss function: the square of (model output minus actual value). y_target is fed from y_vals above.
loss = tf.square(my_output - y_target)

sess = tf.Session()
init = tf.global_variables_initializer()  # initialize variables
sess.run(init)

# Gradient descent with learning rate 0.02: each iteration moves A against the gradient of the loss,
# by 0.02 times that gradient, so the steps get smaller as A approaches the optimum.
my_opt = tf.train.GradientDescentOptimizer(learning_rate=0.02)
train_step = my_opt.minimize(loss)  # goal: minimize the loss function

for i in range(100):  # 0 to 99
    # Pick one sample at random
    rand_index = np.random.choice(100)
    rand_x = [x_vals[rand_index]]
    rand_y = [y_vals[rand_index]]
    # Feed every placeholder the loss depends on (directly or indirectly): x_data gets rand_x, y_target gets rand_y
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    # Print progress
    if i % 5 == 0:
        print('step: ' + str(i) + ' A = ' + str(sess.run(A)))
        print('loss: ' + str(sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})))

Output:

step: 0 A = [-0.29324722]
loss: [107.103676]
step: 5 A = [1.6392573]
loss: [69.70741]
step: 10 A = [3.1867485]
loss: [43.80286]
step: 15 A = [4.4426436]
loss: [32.31665]
step: 20 A = [5.454427]
loss: [28.393408]
step: 25 A = [6.2705126]
loss: [16.252668]
step: 30 A = [6.9050083]
loss: [6.105043]
step: 35 A = [7.4409676]
loss: [8.232471]
step: 40 A = [7.9324813]
loss: [6.8031826]
step: 45 A = [8.33953]
loss: [2.0388503]
step: 50 A = [8.59281]
loss: [0.87392443]
step: 55 A = [8.817221]
loss: [1.0634136]
step: 60 A = [9.096114]
loss: [3.1473236]
step: 65 A = [9.231487]
loss: [1.9277898]
step: 70 A = [9.415066]
loss: [0.22827132]
step: 75 A = [9.4759245]
loss: [0.3650696]
step: 80 A = [9.474044]
loss: [3.6430228]
step: 85 A = [9.543233]
loss: [0.00908985]
step: 90 A = [9.6931]
loss: [0.03487607]
step: 95 A = [9.785975]
loss: [0.1980833]
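The update that the optimizer performs can be written out by hand. The following is a rough sketch in plain Python, not the TensorFlow implementation: for the loss (A*x - y)^2 the gradient with respect to A is 2*(A*x - y)*x, and each step subtracts the learning rate times that gradient. The function name `sgd_fit` and the seeds are assumptions made for this illustration.

```python
import random

def sgd_fit(x_vals, y_vals, lr=0.02, steps=100, seed=0):
    """Fit y = a * x by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    a = rng.gauss(0.0, 1.0)             # random initial value, like tf.random_normal
    for _ in range(steps):
        i = rng.randrange(len(x_vals))  # pick one sample at random
        x, y = x_vals[i], y_vals[i]
        grad = 2.0 * (a * x - y) * x    # d/da of (a*x - y)^2
        a -= lr * grad                  # step against the gradient
    return a

# Simulated data with the same shape as in the post: x ~ N(1, 0.1), y = 10.
data_rng = random.Random(42)
x_vals = [data_rng.gauss(1.0, 0.1) for _ in range(100)]
y_vals = [10.0] * 100

a_100 = sgd_fit(x_vals, y_vals, steps=100)
a_1000 = sgd_fit(x_vals, y_vals, steps=1000)
print(a_100, a_1000)
```

Because the step is proportional to the gradient, A moves quickly at first and more slowly as it gets close to 10, which matches the pattern in the output above.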

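3. Logistic regression example

This is the classifier example.
Many TensorFlow data sources are fetched from the internet, for example the classic iris dataset.
Iris is the English name of the flower called 鳶尾花 in Chinese. The Iris dataset is a multivariate dataset of iris flowers. The task is to predict which of three species (Setosa, Versicolour, Virginica) a flower belongs to from four attributes: sepal length, sepal width, petal length and petal width.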

[Figure: what an iris flower looks like]

The following code loads the data.


# sklearn (scikit-learn) is a machine learning package that bundles many datasets
# Install: pip install -U scikit-learn
#      When this was written, sklearn depended on python>=2.7, numpy (numerical array library), scipy (scientific algorithms and data tools)
from sklearn import datasets
iris = datasets.load_iris()
print('sample feature: feature_names: ' + str(iris.feature_names) + " data length: " + str(len(iris.data)))
print('sample target: target_names: ' + str(iris.target_names) + " target length: " + str(len(iris.target)))
# Sample data: a 2-D array of shape (150, 4)
print(iris.data)
# Sample labels: a 1-D array of length 150
print(iris.target)

Result:

sample feature: feature_names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] data length: 150
sample target: target_names: ['setosa' 'versicolor' 'virginica'] target length: 150
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 # ... (rows omitted)
 [6.  3.  4.8 1.8]
 [6.9 3.1 5.4 2.1]
 [6.7 3.1 5.6 2.4]
 [6.9 3.1 5.1 2.3]
 [5.8 2.7 5.1 1.9]
 [6.8 3.2 5.9 2.3]
 [6.7 3.3 5.7 2.5]
 [6.7 3.  5.2 2.3]
 [6.3 2.5 5.  1.9]
 [6.5 3.  5.2 2. ]
 [6.2 3.4 5.4 2.3]
 [5.9 3.  5.1 1.8]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]

Since we only want a simple binary classifier that predicts whether a flower is Setosa, the data needs a small transformation.

The code is as follows:

#import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
# sklearn (scikit-learn) is a machine learning package that bundles many datasets
# Install: pip install -U scikit-learn
#      When this was written, sklearn depended on python>=2.7, numpy (numerical array library), scipy (scientific algorithms and data tools)
from sklearn import datasets



iris = datasets.load_iris()
print('sample feature: feature_names: ' + str(iris.feature_names) + " data length: " + str(len(iris.data)))
print('sample target: target_names: ' + str(iris.target_names) + " target length: " + str(len(iris.target)))
# Sample data: a 2-D array of shape (150, 4)
#print(iris.data)
# Sample labels: a 1-D array of length 150
#print(iris.target)


# Binary labels: 1 if the sample is the first species (Setosa, target 0), otherwise 0
temp = []
for x in iris.target:
    temp.append(1 if x == 0 else 0)
# Convert the list to an array. The lines above can also be written as:
# binary_target = np.array([1 if x == 0 else 0 for x in iris.target])
binary_target = np.array(temp)
print('binary_target: ')
print(binary_target)

# Inputs: use only two features, petal length and petal width
iris_2d = np.array([[x[2], x[3]] for x in iris.data])
print('iris_2d: ')
print(iris_2d)

# Batch size for training is 20
batch_size = 20
x1_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
x2_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

A = tf.Variable(tf.random_normal(shape=[1,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))

# Define the model: my_output = x1 - (A * x2 + b)
my_mult = tf.matmul(x2_data, A)
my_add = tf.add(my_mult, b)
my_output = tf.subtract(x1_data, my_add)

# Loss function: labels are the 0/1 targets, logits are the raw model outputs
xentropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_target, logits=my_output)
my_opt = tf.train.GradientDescentOptimizer(0.05)
train_step = my_opt.minimize(xentropy)

# Initialize variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# Start iterating to update the model, i.e. to learn A and b
for i in range(1000):
    # Draw 20 indices uniformly at random (with replacement) from np.arange(len(iris_2d)),
    # e.g. [ 66  42  96 115  45 127  31  70 148  57  60 127  56  96   7  63  75 127 110 144]
    rand_index = np.random.choice(len(iris_2d), size=batch_size)
    print('rand_index ' + str(rand_index))
    # rand_x is a 20x2 array, something like [[4.5 1.5] ... [1.3 0.2]]
    rand_x = iris_2d[rand_index]
    #print(' rand_x ' + str(rand_x))
    print(' rand_x shape: ' + str(rand_x.shape))

    rand_x1 = np.array([[x[0]] for x in rand_x])
    rand_x2 = np.array([[x[1]] for x in rand_x])
    #print(' rand_x1 ' + str(rand_x1))
    #print(' rand_x2 ' + str(rand_x2))

    # Using binary_target[rand_index] directly gives e.g. [0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 1 1 0 0 0],
    # a 1-D array of shape (20,) with 20 elements.
    # We need shape (20, 1) instead, because that is the shape of the placeholder.
    # [[y] for y in binary_target[rand_index]] gives e.g. [[0], [0], [0], [0], [0], [0], [1], [0], [1], [0], [0], [0], [0], [1], [1], [1], [1], [1], [0], [0]],
    # which is then converted to an array of shape (20, 1).
    rand_y = np.array([[y] for y in binary_target[rand_index]])
    print('rand_y shape ' + str(rand_y.shape))
    print('rand_y  ' + str(rand_y))

    sess.run(train_step, feed_dict={x1_data: rand_x1, x2_data: rand_x2, y_target: rand_y})
    if (i + 1) % 200 == 0:
        print('step: ' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b)))
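The loss used above combines the sigmoid and the cross-entropy in one call. As a rough sketch of what that computes per element, the plain-Python functions below compare the textbook formula with the numerically stable form given in the TensorFlow documentation, max(x, 0) - x*z + log(1 + exp(-|x|)), where x is the logit and z the label. The function names are assumptions for this illustration, and this is not TensorFlow's actual implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def naive_xent(logit, label):
    """Textbook cross-entropy: -z*log(p) - (1-z)*log(1-p) with p = sigmoid(logit)."""
    p = sigmoid(logit)
    return -label * math.log(p) - (1.0 - label) * math.log(1.0 - p)

def stable_xent(logit, label):
    """Equivalent form that avoids overflow for large |logit|."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Illustrative (logit, label) pairs; the label must be 0 or 1, the logit is any real number.
for logit, label in [(2.0, 1.0), (2.0, 0.0), (-3.0, 1.0), (0.0, 0.0)]:
    print(logit, label, naive_xent(logit, label), stable_xent(logit, label))
```

The two arguments play different roles: the label is the 0/1 target and the logit is the unbounded model output (here my_output), so passing them in the wrong order changes what is being optimized.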

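Summary

Generate the sample data.
Initialize the placeholders (used to feed in sample data) and the variables (the values updated during iteration, such as A, which make up the model we are after).
Build the loss function.
Define an optimizer algorithm.
Iterate over random samples to update the variables.

PS: To keep the examples simple and easy to follow, this post leaves out the prediction logic. It is covered in a follow-up post.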
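As a hint of what prediction could look like for the logistic regression model above, here is a small sketch in plain Python. It is not the author's follow-up code: the function `predict_setosa` is a hypothetical helper, and the values of A and b are made up for illustration rather than taken from a training run.

```python
import math

def predict_setosa(petal_length, petal_width, A, b):
    """Return 1 (Setosa) if sigmoid(x1 - (A * x2 + b)) > 0.5, else 0."""
    logit = petal_length - (A * petal_width + b)   # same form as my_output
    prob = 1.0 / (1.0 + math.exp(-logit))
    return 1 if prob > 0.5 else 0

# Hypothetical parameters, NOT results from the training loop in this post.
A, b = 8.0, -3.0

# Petal length and width taken from the first and last rows of the printed iris data.
print(predict_setosa(1.4, 0.2, A, b))
print(predict_setosa(5.1, 1.8, A, b))
```

Since sigmoid(z) > 0.5 exactly when z > 0, the same decision can be made by checking the sign of the logit, which is the line x1 = A * x2 + b in the petal length and width plane.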
