Andrew Ng Deep Learning, Course 4, Week 4, Programming Assignment 2: Neural Style Transfer
阿新 • Published: 2018-11-29
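1. Deeplearning-assignment
In this section we study the Neural Style Transfer (NST) algorithm, which merges two images with different styles into a single image.
Problem statement: neural style transfer is one of the more entertaining techniques in deep learning. The algorithm merges a content image with a style image, so that the result keeps the content of the first and takes on the style of the second.
Neural style transfer uses a pretrained convolutional neural network. The idea of taking what a network learned on one task and applying it to another task is called transfer learning.
The three steps to build NST: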
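Compute the content cost: $J_{content}(C, G) = \frac{1}{4 \, n_H \, n_W \, n_C} \sum \left(a^{(C)} - a^{(G)}\right)^2$, where $a^{(C)}$ and $a^{(G)}$ are the activations of one hidden layer for the content image and the generated image.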
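As a minimal sketch, the content cost can be checked in plain NumPy before building any TensorFlow graph. The function below mirrors the normalization used by `compute_content_cost` in the code section; the name `content_cost_np` and the random test activations are assumptions made for illustration only.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Sum of squared differences, normalized by 4 * n_H * n_W * n_C
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

rng = np.random.default_rng(0)
a_C = rng.normal(size=(1, 4, 4, 3))   # made-up "content" activations
a_G = rng.normal(size=(1, 4, 4, 3))   # made-up "generated" activations

same = content_cost_np(a_C, a_C)             # identical activations
different = content_cost_np(a_C, a_G)        # unrelated activations
shifted = content_cost_np(a_C, a_C + 1.0)    # every entry off by exactly 1
print(same, different, shifted)
```

The cost is zero only when the two activation volumes match, and a uniform offset of 1 gives 48 / (4 * 48) = 0.25 up to floating-point rounding.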
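Compute the style cost: for a single layer, $J_{style}^{[l]}(S, G) = \frac{1}{4 \, n_C^2 \, (n_H n_W)^2} \sum_{i,j} \left(G^{(S)}_{ij} - G^{(G)}_{ij}\right)^2$, where $G^{(S)}$ and $G^{(G)}$ are the style matrices of the style image and the generated image.
Style matrix (Gram matrix): $G = A A^T$, where $A$ holds the layer's activations unrolled to shape $(n_C, n_H n_W)$.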
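The style matrix and the per-layer style cost can be sketched in NumPy as follows. The function names are hypothetical, the unrolling follows `compute_layer_style_cost` in the code section, and the small matrix `A` is a made-up example chosen so the Gram matrix is easy to verify by hand.

```python
import numpy as np

def gram_matrix_np(A):
    """A has shape (n_C, n_H * n_W); returns the (n_C, n_C) Gram matrix."""
    return A @ A.T

def layer_style_cost_np(a_S, a_G):
    """Style cost of one layer for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll to (n_H * n_W, n_C), then transpose so each row is one channel
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    GS, GG = gram_matrix_np(S), gram_matrix_np(G)
    return np.sum((GS - GG) ** 2) / (4 * (n_H * n_W * n_C) ** 2)

# Three "channels", two spatial positions each
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 0.0]])
GA = gram_matrix_np(A)
print(GA)

rng = np.random.default_rng(1)
a_S = rng.normal(size=(1, 4, 4, 3))
flipped = a_S[:, ::-1, :, :]          # same activations, rows flipped vertically
flip_cost = layer_style_cost_np(a_S, flipped)
print(flip_cost)
```

Entry (i, j) of the Gram matrix is the dot product of channels i and j, so it records which channels fire together but not where. That is why flipping the activation map spatially leaves the style cost at (numerically) zero.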
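Define the total cost and optimize it:
Build a cost function that minimizes the content cost and the style cost together: $J(G) = \alpha \, J_{content}(C, G) + \beta \, J_{style}(S, G)$.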
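The arithmetic of combining the costs can be sketched in plain Python. The defaults `alpha=10` and `beta=40` and the five equal layer weights of 0.2 are taken from the code section of this post; the function names and the per-layer cost values are invented purely to illustrate the calculation.

```python
def weighted_style_cost(layer_costs, coeffs):
    """Overall style cost: weighted sum of the per-layer style costs."""
    return sum(c * j for c, j in zip(coeffs, layer_costs))

def total_cost_py(J_content, J_style, alpha=10, beta=40):
    """Total cost: alpha weights the content term, beta the style term."""
    return alpha * J_content + beta * J_style

layer_costs = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up per-layer style costs
coeffs = [0.2] * 5                        # equal weights, as in STYLE_LAYERS

J_style = weighted_style_cost(layer_costs, coeffs)   # roughly 0.2 * 15
J = total_cost_py(2.0, J_style)                      # roughly 10*2 + 40*3
print(J_style, J)
```

Raising `beta` relative to `alpha` pushes the generated image toward the style image's texture at the expense of the content image's layout.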
Overall flow of the neural style transfer algorithm:
- Create an Interactive Session
- Load the content image
- Load the style image
- Randomly initialize the image to be generated
- Load the VGG-19 model
- Build the TensorFlow graph:
  - Run the content image through the VGG-19 model and compute the content cost
  - Run the style image through the VGG-19 model and compute the style cost
  - Compute the total cost
  - Define the optimizer and the learning rate
- Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step.
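The image-loading and random-initialization steps in the list above depend on helpers from `nst_utils`, which this post does not show. The sketch below is only a guess at what such helpers might do: the per-channel means, the noise range of ±20, and the 0.6 noise ratio are all assumptions, and the `_sketch` function names are hypothetical.

```python
import numpy as np

# ASSUMPTIONS: per-channel means often used with VGG weights, and a 0.6 noise
# ratio. nst_utils is not shown in the post, so these values are guesses.
ASSUMED_MEANS = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))
ASSUMED_NOISE_RATIO = 0.6

def reshape_and_normalize_sketch(image):
    """(H, W, 3) image -> (1, H, W, 3) float array with the means subtracted."""
    batch = image.astype(np.float32)[np.newaxis, ...]
    return batch - ASSUMED_MEANS

def generate_noise_image_sketch(content_image, noise_ratio=ASSUMED_NOISE_RATIO, seed=0):
    """Blend uniform noise with the content image to initialize the generated image."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-20, 20, size=content_image.shape).astype(np.float32)
    return noise * noise_ratio + content_image * (1 - noise_ratio)

raw = np.full((6, 8, 3), 128, dtype=np.uint8)   # made-up flat gray "photo"
content = reshape_and_normalize_sketch(raw)
generated = generate_noise_image_sketch(content)
print(content.shape, generated.shape)
```

Starting from a noisy copy of the content image rather than from pure noise keeps the initial image loosely correlated with the content, which is one plausible reason for this kind of blend.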
Results of running the code (images not preserved): content image + style image = generated image.
2. Code for the algorithm
import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# model = load_vgg_model("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/imagenet-vgg-verydeep-19.mat")
# print(model)
# content_image = scipy.misc.imread(
# "e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/louvre.jpg")
# imshow(content_image)
# plt.show()
def compute_content_cost(a_C, a_G):
    # a_C, a_G: hidden-layer activations of shape (1, n_H, n_W, n_C)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    # The cost is an elementwise sum of squares, so applying the same transpose
    # to both tensors (instead of reshaping them) does not change the result
    a_C_unrolled = tf.transpose(a_C)
    a_G_unrolled = tf.transpose(a_G)
    J_content = tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled))) / (4 * n_H * n_W * n_C)
    return J_content
# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     J_content = compute_content_cost(a_C, a_G)
#     print("J_content = " + str(J_content.eval()))
# style_image = scipy.misc.imread(
# "e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/monet_800600.jpg")
# imshow(style_image)
# plt.show()
def gram_matrix(A):
    # A: matrix of shape (n_C, n_H * n_W); returns the (n_C, n_C) Gram matrix
    GA = tf.matmul(A, tf.transpose(A))
    return GA
# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     A = tf.random_normal([3, 2 * 1], mean=1, stddev=4)
#     GA = gram_matrix(A)
#     print("GA = " + str(GA.eval()))
def compute_layer_style_cost(a_S, a_G):
    # a_S, a_G: hidden-layer activations of shape (1, n_H, n_W, n_C)
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    # Unroll to (n_H * n_W, n_C); the transpose below gives (n_C, n_H * n_W)
    a_S = tf.reshape(a_S, [n_H * n_W, n_C])
    a_G = tf.reshape(a_G, [n_H * n_W, n_C])
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))
    # Normalization: 4 * n_C^2 * (n_H * n_W)^2
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * tf.square(tf.to_float(n_H * n_W * n_C)))
    return J_style_layer
# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     a_S = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     J_style_layer = compute_layer_style_cost(a_S, a_G)
#     print("J_style_layer = " + str(J_style_layer.eval()))
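# (layer name, weight) pairs: five layers, weighted equally
STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]
def compute_style_cost(model, STYLE_LAYERS):
    # Relies on the global `sess`; the style image must already be assigned
    # to model['input'] when this function is called
    J_style = 0
    for layer_name, coeff in STYLE_LAYERS:
        out = model[layer_name]
        # a_S is evaluated now, with the style image as input
        a_S = sess.run(out)
        # a_G stays a tensor; it is evaluated later, once the generated image is the input
        a_G = out
        J_style_layer = compute_layer_style_cost(a_S, a_G)
        J_style += coeff * J_style_layer
    return J_style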
def total_cost(J_content, J_style, alpha=10, beta=40):
    J = alpha * J_content + beta * J_style
    return J
# tf.reset_default_graph()
# with tf.Session() as test:
#     np.random.seed(3)
#     J_content = np.random.randn()
#     J_style = np.random.randn()
#     J = total_cost(J_content, J_style)
#     print("J = " + str(J))
tf.reset_default_graph()
sess = tf.InteractiveSession()
# content_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/louvre_small.jpg")
content_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/cat.jpg")
content_image = reshape_and_normalize_image(content_image)
style_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/style.jpg")
style_image = reshape_and_normalize_image(style_image)
generated_image = generate_noise_image(content_image)
imshow(generated_image[0])
plt.show()
model = load_vgg_model("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/imagenet-vgg-verydeep-19.mat")
sess.run(model['input'].assign(content_image))
out = model['conv4_2']
a_C = sess.run(out)
a_G = out
J_content = compute_content_cost(a_C, a_G)
sess.run(model['input'].assign(style_image))
J_style = compute_style_cost(model, STYLE_LAYERS)
J = total_cost(J_content, J_style, 10, 40)
optimizer = tf.train.AdamOptimizer(2.0)
train_step = optimizer.minimize(J)
def model_nn(sess, input_image, num_iterations=100):
    sess.run(tf.global_variables_initializer())
    # Start from the noisy initial image; the optimizer updates model['input'] itself
    sess.run(model['input'].assign(input_image))
    for i in range(num_iterations):
        sess.run(train_step)
        generated_image = sess.run(model['input'])
        # Report the costs and save a snapshot every 10 iterations
        if i % 10 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))
            save_image("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/output/" + str(i) + ".png", generated_image)
    # Save the final generated image
    save_image("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/output/generated_image.jpg", generated_image)
    return generated_image
model_nn(sess, generated_image)
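3. Summary
Having reached this point, you can generate artistic images with the neural style transfer algorithm. This is also the first time you have built a model in which the optimization algorithm updates pixel values rather than only the parameters of a neural network. Deep learning has many different types of models, and this is just one of them.
What you should remember from this section: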
- Neural Style Transfer is an algorithm that given a content image C and a style image S can generate an artistic image.
- It uses representations (hidden layer activations) based on a pretrained ConvNet.
- The content cost function is computed using one hidden layer's activations.
- The style cost function for one layer is computed using the Gram matrix of that layer's activations. The overall style cost function is obtained using several hidden layers.
- Optimizing the total cost function results in synthesizing new images.
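The idea that optimization updates the image rather than the network can be illustrated with a toy NumPy example. This is a sketch under strong simplifying assumptions, not the assignment's method: a fixed random linear map `W` stands in for the frozen pretrained ConvNet, the "images" are 6-element vectors, and plain gradient descent replaces the Adam optimizer used in the code section. All names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen pretrained network: a fixed linear map
W = rng.normal(size=(32, 6))
W_before = W.copy()

def features(pixels):
    return W @ pixels

def content_like_cost(g, c):
    # Squared distance between the "activations" of g and c
    return np.sum((features(g) - features(c)) ** 2)

content = rng.normal(size=6)     # toy "content image" with 6 pixels
generated = rng.normal(size=6)   # random initialization of the generated image

# Step size 1 / L, where L = 2 * sigma_max(W)^2 bounds the gradient's Lipschitz constant
lr = 0.5 / np.linalg.norm(W, 2) ** 2
costs = []
for _ in range(200):
    costs.append(content_like_cost(generated, content))
    # Gradient with respect to the pixels; W itself is never updated
    grad = 2 * W.T @ (features(generated) - features(content))
    generated = generated - lr * grad
costs.append(content_like_cost(generated, content))
print(costs[0], costs[-1])
```

The cost falls while `W` stays untouched, which is the same division of roles as in the post's code: the VGG weights are constants and `model['input']` is the only variable being trained.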