
[Deep Learning] Semantic Segmentation with UNet (2)

Paper: "U-Net: Convolutional Networks for Biomedical Image Segmentation"

Paper link: https://arxiv.org/pdf/1505.04597v1.pdf

Reference code: https://github.com/jakeret/tf_unet

Contents

Motivation

Overview

Network Architecture

Experimental Results

Code Walkthrough


Motivation

First, most earlier deep learning models were classification models, but many vision tasks, especially in medical image processing, require semantic segmentation, i.e. a class label assigned to every single pixel.

Second, many tasks lack a large-scale dataset like ImageNet, and collecting one is very expensive.

Finally, earlier methods were slow, and there is a trade-off between precise localization and the use of image context. Many recent approaches combine features from multiple layers, and this paper is no exception.

Overview

UNet is based on the fully convolutional network (see [Deep Learning] Semantic Segmentation with FCN (1)). Its main idea is to append successive upsampling layers to a conventional convolutional network. Upsampling increases the resolution of the output, and for more precise localization the upsampled features are combined with high-resolution features from the contracting path. An important modification in UNet is that the upsampling part still keeps a large number of feature channels, which lets the network propagate context information to the higher-resolution layers. As a result the whole architecture looks like the letter "U". Like FCN, the network contains no fully connected layers, only convolutional layers.

[Figure: U-Net architecture diagram (from the paper)]

The paper faced several challenges, including very little training data and the separation of touching objects of the same class. For the former, the authors used elastic deformations for data augmentation, which allows the network to learn invariance to such deformations. For the latter, they proposed a weighted loss, in which the separating background labels between touching cells receive a large weight in the loss function.
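As a rough sketch of the elastic-deformation augmentation (following Simard et al., which the paper cites; the function name and parameter values here are my own assumptions, not code from the paper or from the repo discussed below), one can generate a smooth random displacement field and resample the image with it. The same displacement field should be applied to the image and its mask so that they stay aligned.

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_deform(image, alpha=34, sigma=4, seed=None):
    # Random per-pixel displacements, smoothed with a Gaussian so the
    # deformation is locally smooth ("elastic"), then scaled by alpha.
    rng = np.random.default_rng(seed)
    shape = image.shape
    dx = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, shape), sigma) * alpha
    y, x = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing='ij')
    coords = np.array([y + dy, x + dx])
    # Bilinear resampling of the image at the displaced coordinates.
    return map_coordinates(image, coords, order=1, mode='reflect')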

Network Architecture

The network architecture is shown in the figure above. It consists of a contracting path (left side) and an expansive path (right side). The contracting path follows the typical architecture of a convolutional network: the repeated application of two 3x3 convolutions (unpadded), each followed by a rectified linear unit (ReLU), and a 2x2 max pooling operation for downsampling. At each downsampling step the number of feature channels is doubled. Every step in the expansive path consists of an upsampling of the feature map followed by a 2x2 convolution ("up-convolution") that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU. The cropping is necessary because border pixels are lost in every convolution. At the final layer a 1x1 convolution maps each 64-component feature vector to the desired number of classes. In total the network has 23 convolutional layers.
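To make the cropping concrete, here is a minimal two-level sketch with unpadded convolutions as in the paper, using standard Keras layers (this is my own illustration, sized for the paper's 572x572 input; it is not the repo's code, which uses 'same' padding and therefore needs no cropping):

from keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, Cropping2D, concatenate

inp = Input((572, 572, 1))
c1 = Conv2D(64, 3, activation='relu')(inp)    # 572 -> 570 (padding='valid' by default)
c1 = Conv2D(64, 3, activation='relu')(c1)     # 570 -> 568
p1 = MaxPooling2D(2)(c1)                      # 568 -> 284

c2 = Conv2D(128, 3, activation='relu')(p1)    # 284 -> 282
c2 = Conv2D(128, 3, activation='relu')(c2)    # 282 -> 280
u1 = Conv2DTranspose(64, 2, strides=2)(c2)    # 280 -> 560, the "up-convolution"

crop = (568 - 560) // 2                       # 4 border pixels trimmed on each side
skip = Cropping2D(crop)(c1)                   # 568 -> 560, now matches u1
m = concatenate([skip, u1], axis=3)           # 64 + 64 = 128 channels after concatenation

In the repo's implementation below, 'same' padding keeps every feature map at a power-of-two size, so the skip connections can be concatenated without any cropping.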

Experimental Results

Code Walkthrough

GitHub: https://github.com/zhixuhao/unet (the code is based on Keras).

Looking back at the UNet architecture diagram, note the arrow legend in the bottom-right corner.

The horizontal blue arrows represent a 3x3 convolution followed by a ReLU; for example, the very first block is defined as follows:

conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)

The red downward arrows represent 2x2 max pooling, e.g.

pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

The green upward arrows represent a 2x2 up-convolution, e.g.

up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))

The gray horizontal arrows represent concatenation with the corresponding feature map from the contracting path; note that this is concatenation, not addition. For example:

merge6 = concatenate([drop4,up6], axis = 3)

Finally, the arrow at the top right is a plain 1x1 convolution, followed here by a sigmoid:

conv10 = Conv2D(1, 1, activation = 'sigmoid')(conv9)
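The repo segments a single foreground class, so the output has one channel and a sigmoid. For a multi-class variant (my own assumption, not part of the repo), the final layer would instead map to n_classes with a softmax and be trained with categorical cross-entropy, e.g.

conv10 = Conv2D(n_classes, 1, activation = 'softmax')(conv9)  # n_classes is a hypothetical parameter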

The complete UNet definition:

import numpy as np
import os
import skimage.io as io
import skimage.transform as trans
from keras.models import *
from keras.layers import *
from keras.optimizers import *
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as keras


def unet(pretrained_weights = None,input_size = (256,256,1)):
    inputs = Input(input_size)
    # Contracting path: two 3x3 convolutions (+ ReLU) per level, then 2x2 max pooling
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)

    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)

    # Expansive path: 2x2 up-conv, concatenation with the contracting path, then two 3x3 convolutions
    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)

    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)

    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(2, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv10 = Conv2D(1, 1, activation = 'sigmoid')(conv9)

    model = Model(inputs = inputs, outputs = conv10)

    model.compile(optimizer = Adam(lr = 1e-4), loss = 'binary_crossentropy', metrics = ['accuracy'])
    
    #model.summary()

    if(pretrained_weights):
    	model.load_weights(pretrained_weights)

    return model
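
A minimal usage sketch (the arrays here are random noise, only to show the expected input/output shapes; the repo itself feeds the model from a data generator):

import numpy as np

model = unet()
x = np.random.rand(4, 256, 256, 1).astype('float32')            # dummy grayscale images
y = (np.random.rand(4, 256, 256, 1) > 0.5).astype('float32')    # dummy binary masks
model.fit(x, y, batch_size=2, epochs=1)

pred = model.predict(x)                  # sigmoid outputs in (0, 1)
mask = (pred > 0.5).astype('uint8')      # threshold to a binary segmentation mask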