October 26: AlexNet study notes

阿新 • Published: 2018-11-16

Implementing the AlexNet network

An introduction to AlexNet: https://www.cnblogs.com/gongxijun/p/6027747.html

It includes some dimension calculations that I still have not fully worked out.
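The dimension calculations can be checked with a few lines of arithmetic. This is only a sketch: the helper name out_size is made up, and the kernel, stride and padding values are assumptions taken from the commonly cited AlexNet configuration, not from the linked post.

```python
def out_size(in_size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (in_size + 2 * padding - kernel) // stride + 1

# Assumed layer settings (standard AlexNet with a 227 x 227 input)
size = 227
size = out_size(size, kernel=11, stride=4)            # conv1 -> 55
size = out_size(size, kernel=3, stride=2)             # pool1 -> 27
size = out_size(size, kernel=5, stride=1, padding=2)  # conv2 -> 27
size = out_size(size, kernel=3, stride=2)             # pool2 -> 13
for _ in range(3):                                    # conv3, conv4, conv5 -> 13
    size = out_size(size, kernel=3, stride=1, padding=1)
size = out_size(size, kernel=3, stride=2)             # pool5 -> 6
fc6_inputs = size * size * 256                        # 256 feature maps after conv5
print(size, fc6_inputs)  # 6 9216
```

Under those assumptions, the 4096-unit fc6 layer receives 6 * 6 * 256 = 9216 inputs.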

These posts go into more detail and are worth reading for a deeper understanding:

https://www.cnblogs.com/alexanderkun/p/6917984.html

https://www.cnblogs.com/alexanderkun/p/6917985.html

https://www.cnblogs.com/alexanderkun/p/6918045.html

A practical framework: https://blog.csdn.net/MargretWG/article/details/70491745?locationNum=1&fps=1

It is based on this English-language article: https://kratzert.github.io/2017/02/24/finetuning-alexnet-with-tensorflow.html

Here is another article: https://blog.csdn.net/btbujhj/article/details/73302970

It is also quite detailed and includes some algorithm examples.


1. Using __init__

(Reposted from Zhihu)

When a class defines an __init__ method, that method is called automatically whenever an instance of the class is created. It is normally used to initialize the instance's attributes. For example:
class testClass:
    # Three parameters. self is the instance being created (testman below); the name is
    # only a convention, so with me instead, self.Name below would become me.Name.
    def __init__(self, name, gender):
        # Usually written self.name = name; capitalized here to show that the attribute
        # (left) and the __init__ parameter (right) are different things.
        self.Name = name
        self.Gender = gender  # usually written self.gender = gender
        print('hello')  # shows that __init__ runs as soon as an instance is created

# self is the new instance and is not passed, so only two arguments are given.
# Afterwards testman.Name is 'neo' and testman.Gender is 'male'.
testman = testClass('neo', 'male')

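1.5 Using lambda functions

Reading the code should make the meaning clear.

# -*- coding: utf-8 -*-
# anonymous functions with lambda
def add(x, y):
    return x + y
print('common use', add(3, 5))

p = lambda x, y: x + y
# a lambda has no return statement; the value of its expression is what it returns
print('lambda uses', p(7, 8))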

common use 8
lambda uses 15

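2. dropout & dropout rate

Dropout was proposed by Hinton et al. in 2012. It is an optional trick for preventing overfitting. The abstract of Hinton's paper states that randomly ignoring half of the feature detectors on each training case (setting half of the hidden-unit outputs to 0) noticeably reduces overfitting. Doing so reduces co-adaptation between feature detectors, where some detectors are only useful in combination with other detectors.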

https://www.jianshu.com/p/b5e93fa01385

https://blog.csdn.net/stdcoutzyx/article/details/49022443

https://www.cnblogs.com/zyber/p/6824980.html
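To make the idea concrete, here is a minimal pure-Python sketch of "inverted" dropout. The function name dropout and its signature are my own and not a library API. Scaling the surviving values by 1 / keep_prob is an assumption borrowed from how TensorFlow 1.x documents tf.nn.dropout; the original paper instead rescales the weights at test time.

```python
import random

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each value with probability 1 - keep_prob and
    scale the survivors by 1 / keep_prob so the expected sum is unchanged."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 10
train_out = dropout(hidden, keep_prob=0.5, rng=rng)  # each value is either 0.0 or 2.0
test_out = dropout(hidden, keep_prob=1.0, rng=rng)   # keep_prob = 1 disables dropout
print(train_out)
print(test_out)
```

This is why the inference code later in these notes feeds keep_prob: 1, since nothing should be dropped when classifying.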

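3. input_channels = int(x.get_shape()[-1])

The get_shape() function

get_shape() returns the dimensions of a tensor, that is, the size along each axis. For a 2-D matrix it gives the number of rows and columns, which is very convenient.

import tensorflow as tf

with tf.Session() as sess:
    A = tf.random_normal(shape=[3, 4])

    print(A.get_shape())   # call the method: prints the shape
    print(A.get_shape)     # no parentheses: prints the bound method itself

Output:
(3, 4)
<bound method Tensor.get_shape of <tf.Tensor 'random_normal:0' shape=(3, 4) dtype=float32>>

Note: the first output is the shape itself (a TensorShape, printed like a tuple of sizes). The second is not a shape at all but the bound method object, because get_shape was referenced without being called. To read the size of a single dimension, index the result:

A.get_shape()[0]

This is the first dimension. In the heading above, the index -1 selects the last dimension, which for an image batch laid out as [batch, height, width, channels] is the number of channels.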

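4. lambda functions

(that is, anonymous functions)

Python has anonymous functions, written with lambda. As the name suggests, an anonymous function is a function or subroutine that is defined without an identifier (a function name).

# -*- coding: UTF-8 -*-
f = lambda x, y, z: x + y + z

print(f(1, 2, 3))
print(f(4, 5, 6))

Output:
6
15

Points to keep in mind when using lambda:

A lambda defines a single-line function; for anything more complex, define a normal function with def.
The parameter list can contain several parameters, e.g. lambda x, y: x + y
The body cannot contain statements; it is limited to exactly one expression.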
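A typical place where the single-expression limit is no problem is a throwaway sort key. The sketch below uses made-up data (layer names with unit counts) purely for illustration.

```python
# (layer name, number of output units) pairs; the values are illustrative only
layers = [("fc8", 1000), ("fc6", 4096), ("fc7", 4096)]

# A lambda used as a one-off sort key: order by the layer name
by_name = sorted(layers, key=lambda layer: layer[0])

# A conditional expression is still a single expression, so it is allowed
relu = lambda v: v if v > 0 else 0

print(by_name)
print([relu(v) for v in (-2, 0, 3)])
```

Anything that needs an assignment, a loop or several steps should become a normal def instead.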

5. Using with tf.variable_scope(name) as scope:

tf.variable_scope() is mainly used together with tf.get_variable() to implement variable sharing.

'''
Signature: tf.name_scope(*args, **kwds)
Docstring:
Returns a context manager for use when defining a Python op.
'''

These two blog posts already explain it in great detail:

https://www.cnblogs.com/adong7639/p/8136273.html

tf.get_variable and tf.variable_scope
https://www.aliyun.com/jiaocheng/519743.html

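6. weights = tf.get_variable

This is simply a function that creates a variable:

tf.get_variable(name, shape, initializer): name is the variable's name, shape is its dimensions, and initializer is how it is initialized.

With tf.Variable, a naming conflict is handled automatically by the system (the new variable gets a uniquified name). With tf.get_variable(), the conflict is not resolved and an error is raised instead, unless the enclosing variable scope has reuse enabled.

Because of this difference, use tf.get_variable() whenever variables need to be shared. In other cases the two are used in the same way.

TensorFlow has two ways of creating variables, tf.Variable() and tf.get_variable(); the post below describes the differences: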

https://blog.csdn.net/u012436149/article/details/53696970
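The sharing behaviour can be imitated with a toy registry. To be clear, ToyVariableStore below is a hypothetical stand-in written for these notes, not TensorFlow code; it only mimics the rule "same name without reuse is an error, same name with reuse returns the existing variable". The variable name and shape are illustrative.

```python
class ToyVariableStore:
    """Hypothetical stand-in for TensorFlow's variable store (illustration only)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, name, shape, reuse=False):
        if name in self._vars:
            if not reuse:
                # Mirrors the "conflict is an error" behaviour described above
                raise ValueError("Variable %s already exists" % name)
            return self._vars[name]  # sharing: hand back the same object
        if reuse:
            raise ValueError("Variable %s does not exist" % name)
        self._vars[name] = {"name": name, "shape": tuple(shape)}
        return self._vars[name]

store = ToyVariableStore()
w1 = store.get_variable("conv1/weights", [11, 11, 3, 96])
w2 = store.get_variable("conv1/weights", [11, 11, 3, 96], reuse=True)
print(w1 is w2)  # True

try:
    store.get_variable("conv1/weights", [11, 11, 3, 96])
except ValueError as err:
    print("error:", err)
```

In real TensorFlow 1.x code the reuse flag lives on the variable scope rather than on the call itself.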

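7. tf.split & tf.concat

tf.split(value, num_or_size_splits, axis=0, num=None, name='split')

This function splits a tensor: it takes the tensor to split plus the parameters, and returns the resulting pieces.
value is the tensor to be split.

Take a 3-D tensor as an example, say a 20 * 30 * 40 tensor my_tensor. Think of it as a cake 20 cm long, 30 cm wide and 40 cm high, where every cubic centimetre is one element.

There are two ways to split it:
1. If num_or_size_splits is an integer, it is the number of smaller tensors the tensor is cut into, and axis says which dimension is cut (counting from 0). tf.split(my_tensor, 2, 0) returns two 10 * 30 * 40 tensors.
2. If num_or_size_splits is a vector, the tensor is cut into as many pieces as the vector has entries, again along axis. For example, tf.split(my_tensor, [10, 5, 25], 2) returns three tensors of sizes 20 * 30 * 10, 20 * 30 * 5 and 20 * 30 * 25. The entries of the vector must sum to the size of the original tensor along axis (10 + 5 + 25 = 40).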
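The shape bookkeeping for both modes can be written down without TensorFlow. split_shapes is a hypothetical helper of my own that only computes the shapes of the pieces; it does not split any data.

```python
def split_shapes(shape, num_or_size_splits, axis=0):
    """Return the shapes of the pieces that a split along `axis` would produce."""
    size = shape[axis]
    if isinstance(num_or_size_splits, int):
        # Integer mode: the axis must divide evenly into that many pieces
        if size % num_or_size_splits != 0:
            raise ValueError("axis size must be divisible by the number of splits")
        sizes = [size // num_or_size_splits] * num_or_size_splits
    else:
        # Vector mode: explicit piece sizes that must add up to the axis size
        sizes = list(num_or_size_splits)
        if sum(sizes) != size:
            raise ValueError("split sizes must sum to the axis size")
    return [tuple(shape[:axis]) + (s,) + tuple(shape[axis + 1:]) for s in sizes]

print(split_shapes((20, 30, 40), 2, axis=0))
print(split_shapes((20, 30, 40), [10, 5, 25], axis=2))
```

The first call gives two (10, 30, 40) pieces and the second gives the three pieces from the cake example.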

tf.concat() and tf.stack()

https://www.cnblogs.com/mdumpling/p/8053474.html

8. tf.nn.xw_plus_b

tf.nn.xw_plus_b(x, weights, biases)

is equivalent to tf.matmul(x, weights) + biases

# -*- coding: utf8 -*-
import tensorflow as tf
x = [[1, 2, 3], [4, 5, 6]]
w = [[7, 8], [9, 10], [11, 12]]
b = [[3, 3], [3, 3]]
result1 = tf.nn.xw_plus_b(x, w, [3, 3])
result2 = tf.matmul(x, w) + b
# No variables are created here, so this op does nothing; kept for completeness
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    # Run the init operation.
    sess.run(init_op)
    print(sess.run(result1))
    print(sess.run(result2))

The result is

[[ 61  67]
 [142 157]]
[[ 61  67]
 [142 157]]
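The same numbers can be reproduced by hand. This pure-Python version is only a sketch for checking the arithmetic; the function reuses the name xw_plus_b for readability, but it is my own code, not the TensorFlow op.

```python
def xw_plus_b(x, w, b):
    """Compute x @ w + b for nested lists; b is a 1-D bias added to every row."""
    cols = list(zip(*w))  # columns of w
    return [[sum(xi * wi for xi, wi in zip(row, col)) + bj
             for col, bj in zip(cols, b)]
            for row in x]

x = [[1, 2, 3], [4, 5, 6]]
w = [[7, 8], [9, 10], [11, 12]]
print(xw_plus_b(x, w, [3, 3]))  # [[61, 67], [142, 157]]
```

For example, the top-left entry is 1*7 + 2*9 + 3*11 + 3 = 61.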

Some functions used in the testing part

Part 1

import os
import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt


#  mean of imagenet dataset in BGR
imagenet_mean = np.array([104., 117., 124.], dtype=np.float32)

#  add path of testImages
current_dir = os.getcwd()
image_dir = os.path.join(current_dir, 'images')

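#  this code locates our dataset directory  #

# note that OpenCV reads images in BGR order, not the more familiar RGB, so convert at the end before displaying

os.getcwd()

In Python, os.getcwd() returns the current path.

Its prototype is:

os.getcwd()

The function takes no arguments and returns the current working directory. Note that this is not the directory where the script file is stored, but the directory from which the script is run.

os.path.join()

os.path.join(os.getcwd(), 'data') takes the current directory and combines it with 'data' to form a new path.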

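The following is excerpted from https://www.cnblogs.com/donfaquir/p/9042673.html

While using it, I wrote the following code:

import os
path = "F:/gts/gtsdate/"
b = os.path.join(path, "/abc")

The result (on Windows) is:

'F:/abc'

rather than what I expected:

"F:/gts/gtsdate/abc"

The reason is that the second argument to os.path.join(), "/abc", starts with /, so it is treated as an absolute path and the directory part of the first argument is discarded.
Removing that character fixes it:

b = os.path.join(path, "abc")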
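The behaviour depends on the platform, which can be checked on any machine by calling the two implementations directly. ntpath and posixpath are the standard-library modules that os.path maps to on Windows and on Linux/macOS respectively.

```python
import ntpath      # the implementation behind os.path on Windows
import posixpath   # the implementation behind os.path on Linux and macOS

path = "F:/gts/gtsdate/"

# A leading slash makes the second component absolute, so earlier directories are dropped
print(ntpath.join(path, "/abc"))      # F:/abc
print(posixpath.join(path, "/abc"))   # /abc

# Without the leading slash the component is appended as expected
print(ntpath.join(path, "abc"))       # F:/gts/gtsdate/abc
print(posixpath.join(path, "abc"))    # F:/gts/gtsdate/abc
```

On Windows the drive letter survives while the directories are lost; on POSIX systems the whole first argument is discarded.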

An introduction to common os.path methods: http://www.cnblogs.com/wuxie1989/p/5623435.html

Part 2

#get list of all images
img_files = [os.path.join(image_dir, f) for f in os.listdir(image_dir) if f.endswith('.jpeg')]

#load all images
imgs = []
for f in img_files:
    imgs.append(cv2.imread(f))
    
#plot images
fig = plt.figure(figsize=(15,6))
for i, img in enumerate(imgs):
    fig.add_subplot(1,3,i+1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')

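#  at this point all images in our dataset have been read in  #

img_files = [os.path.join(image_dir, f) for f in os.listdir(image_dir) if f.endswith('.jpeg')]

It is worth getting familiar with this idiom.

First comes the path construction: os.path.join() appends f to image_dir, and the resulting paths make up img_files.

f itself is produced by a for loop combined with an if condition.

f is each file in the image_dir folder whose name ends with .jpeg, i.e. the image files. (Note: in my experience, a file that was merely renamed to .jpeg could not be read; check the file's actual type in its properties.)

os.listdir() returns a list of the names of the files and folders contained in the given folder.

The list is in arbitrary order, so sort it if a stable order matters. It does not include '.' and '..', even though they are present in the folder. The syntax of listdir() is:

os.listdir(path)--------(path -- the directory to list)

It returns the list of files and folders under the given path.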
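The comprehension can be tried out safely in a temporary folder. The file names below are made up for the demo, and sorted() is added because os.listdir() makes no promise about order.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as image_dir:
    # Create a few empty placeholder files (hypothetical names)
    for name in ("zebra.jpeg", "llama.jpeg", "notes.txt"):
        open(os.path.join(image_dir, name), "w").close()

    # Same comprehension as above, with sorted() for a reproducible order
    img_files = sorted(
        os.path.join(image_dir, f)
        for f in os.listdir(image_dir)
        if f.endswith(".jpeg")
    )
    names = [os.path.basename(p) for p in img_files]

print(names)  # ['llama.jpeg', 'zebra.jpeg']
```

The .txt file is filtered out by the endswith() test, which looks only at the file name and never at the file contents.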

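for i, img in enumerate(imgs):

i holds the index of the image and img holds the image data.

enumerate() turns an iterable (such as a list, tuple or string) into a sequence of (index, item) pairs, so that the index and the item are available together. It is typically used in a for loop.

seq = ['one', 'two', 'three']
for i, element in enumerate(seq):
    print(i, element)

Result:
0 one
1 two
2 three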
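The plotting code passes i+1 to add_subplot because subplot positions start at 1. As a small aside, enumerate() can do that offset itself through its start argument:

```python
seq = ['one', 'two', 'three']

# start=1 makes the counter 1-based, so no "+ 1" is needed in the loop body
positions = [(pos, element) for pos, element in enumerate(seq, start=1)]
print(positions)  # [(1, 'one'), (2, 'two'), (3, 'three')]
```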

plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

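Most colour images we see in everyday life are RGB, but image processing also needs grayscale, binary, HSV, HSI and other colour representations. OpenCV provides cvtColor() to convert between them. Here is the (C++) definition of cvtColor; in Python the call is cv2.cvtColor(src, code) and the converted image is returned:

cvtColor(InputArray src, OutputArray dst, int code, int dstCn=0 );

For details see https://blog.csdn.net/keith_bb/article/details/53470170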
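For the BGR to RGB case used here, the conversion simply reverses the channel order of every pixel. The sketch below shows that on a tiny nested-list "image"; bgr_to_rgb is my own illustrative helper, not an OpenCV function.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a height x width x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in image]

# A 1 x 2 "image": one pure-blue and one pure-red pixel, stored as (B, G, R)
bgr = [[(255, 0, 0), (0, 0, 255)]]
rgb = bgr_to_rgb(bgr)
print(rgb)  # [[(0, 0, 255), (255, 0, 0)]]
```

Skipping this step before plt.imshow() would show the images with red and blue swapped.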

Part 3

from alexnet import AlexNet
from caffe_classes import class_names

#placeholder for input and dropout rate
x = tf.placeholder(tf.float32, [1, 227, 227, 3])
keep_prob = tf.placeholder(tf.float32)


#create model with default config ( == no skip_layer and 1000 units in the last layer)
#  ha, the most important step is done in a single understated line
model = AlexNet(x, keep_prob, 1000, [])


#define activation of last layer as score
score = model.fc8


#create op to calculate softmax 
softmax = tf.nn.softmax(score)

def __init__(self, x, keep_prob, num_classes, skip_layer, weights_path = 'DEFAULT'):

model = AlexNet ( x, keep_prob, 1000, [ ] )

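Let's read this model against the __init__ signature:

self.X = x
self.KEEP_PROB = keep_prob
self.NUM_CLASSES = 1000
self.SKIP_LAYER = []  (empty)
weights_path == 'DEFAULT'

score = model.fc8 ### use the activation of the last layer as the score

In the same way, compare the definition with the call: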

def fc ( x, num_in, num_out, name, relu = True ) :

self.fc8 = fc ( dropout7, 4096, self.NUM_CLASSES, relu = False, name='fc8' )

Part 4

with tf.Session() as sess:
    
    # Initialize all variables
    sess.run(tf.global_variables_initializer())
    
    # Load the pretrained weights into the model
    model.load_initial_weights(sess)
    
    # Create figure handle
    fig2 = plt.figure(figsize=(15,6))
    
    # Loop over all images
    for i, image in enumerate(imgs):
        
        # Convert image to float32 and resize to (227x227)
        img = cv2.resize(image.astype(np.float32), (227,227))
        
        # Subtract the ImageNet mean
        img -= imagenet_mean
        
        # Reshape as needed to feed into model
        img = img.reshape((1,227,227,3))
        
        # Run the session and calculate the class probability
        probs = sess.run(softmax, feed_dict={x: img, keep_prob: 1})
        
        # Get the class name of the class with the highest probability
        class_name = class_names[np.argmax(probs)]
        
        # Plot image with class name and prob in the title
        fig2.add_subplot(1,3,i+1)
        plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
        plt.title("Class: " + class_name + ", probability: %.4f" %probs[0,np.argmax(probs)])
        plt.axis('off')
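The last two steps of the loop above (softmax over the scores, then argmax to pick the class) can be sketched in plain Python. The scores and class names here are invented for illustration; in the real script they come from model.fc8 and caffe_classes.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores and labels for a 3-class model
class_names = ["llama", "zebra", "sea lion"]
scores = [1.0, 3.0, 0.5]

probs = softmax(scores)
best = max(range(len(probs)), key=lambda i: probs[i])  # same role as np.argmax
print("Class: %s, probability: %.4f" % (class_names[best], probs[best]))
```

Softmax does not change which class has the largest score; it only turns the scores into probabilities that sum to 1.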

model.load_initial_weights(sess)

This calls the load_initial_weights method of the model instance; here is its signature:

def load_initial_weights(self, session):

************************

# Convert image to float32 and resize to (227x227)
img = cv2.resize(image.astype(np.float32), (227,227))

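What astype() does:

We often read data from a text file into a NumPy array, and the default dtype is float64.

Sometimes, however, we want certain columns as integers. Assigning to the dtype directly (b.dtype = 'int') goes wrong, because it reinterprets the raw bytes instead of converting the values; with a 32-bit integer type the array even doubles in length.

The fix is astype():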

>>> b = np.array([1.23,12.201,123.1])
>>>
>>> b
array([   1.23 ,   12.201,  123.1  ])
>>> b.dtype
dtype('float64')
>>> c = b.astype(int)
>>> c
array([  1,  12, 123])
>>> c.dtype
dtype('int32')
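The difference between reinterpreting bytes and converting values can be shown with the standard struct module, without NumPy. This is an analogy for what happens to the array's memory, under the assumption of 8-byte floats and 4-byte integers.

```python
import struct

values = [1.23, 12.201, 123.1]

# Three float64 values occupy 3 * 8 = 24 bytes
raw = struct.pack("<3d", *values)

# Reinterpreting the same bytes as int32 gives 24 / 4 = 6 unrelated integers
reinterpreted = struct.unpack("<6i", raw)
print(len(reinterpreted))  # 6

# Converting each value truncates toward zero, which is what astype(int) does for floats
converted = [int(v) for v in values]
print(converted)  # [1, 12, 123]
```

The six reinterpreted integers have nothing to do with 1, 12 and 123, which is why changing the dtype in place looks like an error.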