
Reading Notes — "Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform"

This post contains my reading notes on the paper "Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform", i.e. SFTGAN. Link to the paper: https://arxiv.org/pdf/1804.02815.pdf. These notes follow my own (sometimes jumpy) reading order and are mainly for my own understanding.

Besides describing SFTGAN, this post also analyzes several losses used in SR.


"Recovering realistic texture" suggests that, much as NIQE does, the goal is to favor sharper images rather than to optimize PSNR as conventional SR methods do (the aim is more realistic and visually pleasing textures). "Spatial Feature Transform" means transforming intermediate features, which should be the core of the paper: recovering texture by transforming features in feature space.

"In this paper, we show that it is possible to recover textures faithful to semantic classes." Recovering texture conditioned on semantic classes — does that mean restoring texture per semantic region? Let's read on.

"We only need to modulate features of a few intermediate layers in a single network conditioned on semantic segmentation probability maps." In other words, only the features of a few intermediate layers of a single network are modulated, conditioned on semantic segmentation probability maps. The Spatial Feature Transform (SFT) layer generates the affine transformation parameters used for this spatial feature modulation.

Conventional SR methods optimize the pixel-wise mean squared error (MSE). This pixel-level loss tends to produce blurry, overly smooth images ("conventional pixel-wise mean squared error (MSE) loss ... tends to encourage blurry and overly-smoothed results"), because it encourages the network to output an average of many plausible solutions. The perceptual loss introduced with SRGAN instead optimizes a super-resolution model in a feature space rather than pixel space, enhancing visual quality by minimizing the error in that feature space. Going further, an adversarial loss (generating images with more natural details) yields more natural-looking results, and a local texture matching loss partly reduces visually unpleasant artifacts.
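To make the contrast concrete, here is a toy sketch (not the paper's implementation) of pixel-wise MSE versus a feature-space loss. `extract_features` is a hypothetical stand-in for a pretrained network such as VGG; here it is just 2×2 average pooling over a flat grayscale image, which is enough to show that two different pixel patterns can be identical in feature space.

```python
def mse(a, b):
    """Mean squared error between two flat lists of values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def extract_features(img, width):
    """Toy feature extractor: 2x2 average pooling over a flat grayscale image.
    A real perceptual loss would use a pretrained CNN (e.g. VGG) here."""
    height = len(img) // width
    feats = []
    for i in range(0, height - 1, 2):
        for j in range(0, width - 1, 2):
            block = (img[i * width + j] + img[i * width + j + 1]
                     + img[(i + 1) * width + j] + img[(i + 1) * width + j + 1])
            feats.append(block / 4.0)
    return feats

def pixel_loss(sr, hr):
    """Conventional pixel-wise MSE loss."""
    return mse(sr, hr)

def perceptual_loss(sr, hr, width):
    """Feature-space loss: MSE computed on extracted features instead of pixels."""
    return mse(extract_features(sr, width), extract_features(hr, width))
```

For a 4×4 image, a striped pattern `[0, 2, 0, 2, ...]` and a flat gray image `[1, 1, 1, ...]` have pixel MSE of 1.0 but identical pooled features, illustrating why averaging many plausible solutions minimizes pixel loss while losing texture.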

On the problem of texture recovery: as the figure below shows, different HR patches can correspond to very similar LR patches. The "without prior" result in the figure comes from perceptual and adversarial losses alone, so a stronger texture prior is needed. The authors train two CNNs, one on a plant dataset and one on a building dataset, and recover noticeably better texture details; training specialized models for each semantic category shows that a semantic prior can change the SR result.

Hence this paper investigates class-conditional image super-resolution with a CNN, and it must also handle SR when multiple semantic classes coexist in one image. By transforming the features of certain intermediate layers, SFT can change the behavior of the SR network. The SFT layer is conditioned on semantic segmentation probability maps; from them it generates a pair of modulation parameters that apply a spatially-varying affine transformation to the network's feature maps. Because only intermediate features of a single network are transformed, an HR image with rich semantic regions can be reconstructed in a single forward pass. The paper also adopts a local texture matching loss based on categorical priors.

The authors "assume multiple categorical classes to co-exist in an image, and propose an effective layer that enables an SR network to generate rich and realistic textures in a single forward pass conditioned on the prior provided up to the pixel level."

The paper notes that "Conditional Normalization (CN) applies a learned function of some conditions to replace parameters for feature-wise affine transformation in BN." In other words, a CN layer replaces BN's learned feature-wise affine parameters with the output of a function learned from some conditioning input.
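A minimal sketch of this idea, on a 1-D feature vector for clarity (not the paper's code): in BN the affine parameters (gamma, beta) are learned constants, while in CN they are produced by a learned function of the condition. `param_fn` below is a hypothetical placeholder for that learned function.

```python
def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a feature vector, then apply a feature-wise affine transform."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in x]

def conditional_norm(x, condition, param_fn, eps=1e-5):
    """CN: the affine parameters are a learned function of the condition,
    rather than fixed learned constants as in BN."""
    gamma, beta = param_fn(condition)
    return batch_norm(x, gamma, beta, eps)
```

Since the normalized activations have zero mean, the output mean equals whatever `beta` the condition produces, so different conditions shift and scale the same features differently.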

The SFT layer can handle spatial conditions: "It is capable of converting spatial conditions for not only feature-wise manipulation but also spatial-wise transformation."

On semantic segmentation: this paper uses semantic maps to guide texture recovery in different regions in the SR domain. Moreover, it uses probability maps, rather than hard segment labels, to capture fine texture distinctions.

The SFT layer learns a mapping function that outputs a pair of modulation parameters based on some prior condition; this learned parameter pair adaptively influences the output by applying a spatially-varying affine transformation to each intermediate feature map of the SR network (so the prior is modeled by a pair of affine transformation parameters).
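The modulation itself can be sketched as follows, using flat lists to stand in for H×W feature maps. In the actual network the mapping ψ from segmentation probability maps to (gamma, beta) is a small stack of conv layers; `condition_to_params` below is a hypothetical placeholder with a hand-picked rule, purely to show the spatially-varying affine form SFT(F | γ, β) = γ ⊙ F + β.

```python
def condition_to_params(prob_map):
    """Hypothetical mapping psi: segmentation probabilities -> (gamma, beta).
    Toy rule: scale confident pixels up, shift uncertain pixels instead."""
    gamma = [1.0 + p for p in prob_map]          # per-pixel scale
    beta = [0.5 * (1.0 - p) for p in prob_map]   # per-pixel shift
    return gamma, beta

def sft_layer(features, prob_map):
    """SFT(F | gamma, beta) = gamma * F + beta, element-wise, so the affine
    transform varies spatially with the semantic prior."""
    gamma, beta = condition_to_params(prob_map)
    return [g * f + b for g, f, b in zip(gamma, features, beta)]
```

Note that unlike a per-channel affine transform (as in BN), gamma and beta here have the same spatial size as the feature map, which is what lets one network treat a "sky" region and a "building" region differently in a single forward pass.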

The SFTGAN architecture, shown in the figure below, consists of a condition network and an SR network. The condition network takes the segmentation probability maps as input and processes them with four convolutional layers to produce intermediate conditions shared by all SFT layers.


Supplement

Perceptual Loss
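Following Johnson et al. (cited below), the feature reconstruction (perceptual) loss for layer $j$ of a pretrained network $\phi$, whose feature maps have shape $C_j \times H_j \times W_j$, can be written as:

```latex
\mathcal{L}_{feat}^{\phi,j}(\hat{y}, y) = \frac{1}{C_j H_j W_j} \left\lVert \phi_j(\hat{y}) - \phi_j(y) \right\rVert_2^2
```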


References

J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016.

J. Bruna, P. Sprechmann, and Y. LeCun. Super-resolution with deep convolutional sufficient statistics. In ICLR, 2015.


Adversarial Loss

C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, 2017.

M. S. Sajjadi, B. Schölkopf, and M. Hirsch. EnhanceNet: Single image super-resolution through automated texture synthesis. In ICCV, 2017.
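For reference, the generator's adversarial loss in SRGAN (Ledig et al., above), summed over $N$ training samples, is:

```latex
\mathcal{L}_{adv} = \sum_{n=1}^{N} -\log D_{\theta_D}\!\left(G_{\theta_G}\!\left(I^{LR}_{n}\right)\right)
```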


Local Texture Matching Loss

M. S. Sajjadi, B. Schölkopf, and M. Hirsch. EnhanceNet: Single image super-resolution through automated texture synthesis. In ICCV, 2017.
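A sketch of a texture matching loss in the spirit of EnhanceNet (not its exact implementation): compare Gram matrices of feature patches, so that local texture statistics are matched rather than exact pixel positions. Feature maps are modeled as a list of channels, each a flat list of values.

```python
def gram(features):
    """Gram matrix G[i][j] = <channel_i, channel_j> / N, where N is the
    number of spatial positions; it captures channel co-occurrence statistics."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
            for ci in features]

def texture_matching_loss(feat_sr, feat_hr):
    """Squared Frobenius distance between the Gram matrices of the SR and HR
    feature patches. In EnhanceNet this is computed over local patches."""
    g_sr, g_hr = gram(feat_sr), gram(feat_hr)
    return sum((a - b) ** 2
               for row_a, row_b in zip(g_sr, g_hr)
               for a, b in zip(row_a, row_b))
```

Because the Gram matrix discards spatial arrangement, spatially permuting a patch leaves the loss at zero while any change in the texture statistics increases it, which is exactly the behavior wanted when matching textures rather than pixels.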