Andrew Ng Deep Learning Course 4, Week 4, Programming Assignment 2: Neural Style Transfer

1. Deep Learning Assignment

In this section we implement the Neural Style Transfer (NST) algorithm, which merges two images, a content image and a style image, into a single generated image.

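Problem statement: Neural Style Transfer is one of the more entertaining techniques in deep learning. The algorithm combines the content of one image with the style of another.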

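Neural Style Transfer uses a pretrained convolutional neural network. Reusing a network trained on one task for a different task is called transfer learning.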

Building NST takes three steps:

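Step 1, the content cost. It compares the activations of the content image and the generated image at one hidden layer (conv4_2 in the code section).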
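As a rough illustration of the content cost, the NumPy sketch below evaluates the same expression that compute_content_cost uses in the code section, on toy arrays. The name content_cost_np and the toy inputs are assumptions made for this example and are not part of the assignment.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Sum of squared differences, scaled by 1 / (4 * n_H * n_W * n_C).
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

# Toy activations (hypothetical values): every entry differs by exactly 1.
a_C = np.ones((1, 2, 2, 3))
a_G = np.zeros((1, 2, 2, 3))

# 12 squared differences of 1, divided by 4 * 2 * 2 * 3 = 48.
print(content_cost_np(a_C, a_G))  # 0.25
```

The cost is zero only when the two activation volumes match entry by entry, which is what pushes the generated image toward the content image.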

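Step 2, the style cost. It compares the Gram matrices of the style image's activations and the generated image's activations, combined over several hidden layers.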

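The style matrix (Gram matrix). For activations unrolled to shape (n_C, n_H * n_W), it is the product of that matrix with its own transpose, so entry (i, j) measures how strongly channels i and j are active together.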
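The Gram matrix and the per-layer style cost can be sketched in NumPy as follows. This mirrors the formulas in gram_matrix and compute_layer_style_cost from the code section. The function names ending in _np and the random toy activations are assumptions for illustration only.

```python
import numpy as np

def gram_matrix_np(A):
    """A has shape (n_C, n_H * n_W); returns the (n_C, n_C) Gram matrix."""
    return A @ A.T

def layer_style_cost_np(a_S, a_G):
    """Style cost for one layer, given activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll so that each row holds one channel's activations over all positions.
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    GS, GG = gram_matrix_np(S), gram_matrix_np(G)
    # Squared difference of the Gram matrices, scaled by 1 / (4 * n_C^2 * (n_H * n_W)^2).
    return np.sum((GS - GG) ** 2) / (4 * (n_C ** 2) * (n_H * n_W) ** 2)

# Toy activations (hypothetical values).
rng = np.random.default_rng(0)
a_S = rng.normal(size=(1, 4, 4, 3))
a_G = rng.normal(size=(1, 4, 4, 3))

GS = gram_matrix_np(a_S.reshape(16, 3).T)
print(GS.shape)                       # (3, 3)
print(layer_style_cost_np(a_S, a_S))  # 0.0
```

Because the Gram matrix sums over all spatial positions, it keeps which channels co-occur but discards where they occur, which is why it captures style rather than layout.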

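Step 3, the total cost. It is a weighted sum of the content cost and the style cost, and it is the quantity that gets minimized.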
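Written out with the weights that appear in the code section (alpha = 10, beta = 40, and 0.2 for each of the five style layers), the combination can be summarized as below. The symbol \lambda^{[l]} for the per-layer weight is notation assumed here for readability.

```latex
J_{style}(S, G) = \sum_{l} \lambda^{[l]} \, J_{style}^{[l]}(S, G),
\qquad \lambda^{[l]} = 0.2 \ \text{for each of the five layers in STYLE\_LAYERS}

J(G) = \alpha \, J_{content}(C, G) + \beta \, J_{style}(S, G),
\qquad \alpha = 10, \ \beta = 40
```

A larger ratio of beta to alpha weights style more heavily relative to content in the generated image.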

A function is then built to minimize the content and style costs.

Overall flow of the Neural Style Transfer algorithm:

  1. Create an Interactive Session
  2. Load the content image
  3. Load the style image
  4. Randomly initialize the image to be generated
  5. Load the VGG-19 model
  6. Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content cost
    • Run the style image through the VGG-19 model and compute the style cost
    • Compute the total cost
    • Define the optimizer and the learning rate
  7. Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step.
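Step 7 of the flow above optimizes the pixels of the generated image rather than any network weights. The toy NumPy sketch below shows that idea in isolation. As a deliberate simplification it drops the VGG network entirely, so the "features" are just the pixels themselves, and it uses plain gradient descent instead of Adam. All names and values are hypothetical.

```python
import numpy as np

def content_cost(target, image):
    # Same scaling as the content cost: squared error / (4 * number of entries).
    return np.sum((target - image) ** 2) / (4 * image.size)

def content_grad(target, image):
    # Analytic gradient of content_cost with respect to the image pixels.
    return -2.0 * (target - image) / (4 * image.size)

rng = np.random.default_rng(0)
target = rng.uniform(0, 1, size=(4, 4, 3))   # stand-in for the content image
image = rng.uniform(-1, 1, size=(4, 4, 3))   # noisy starting image

initial_cost = content_cost(target, image)
learning_rate = 10.0
for step in range(100):
    # The update changes the image itself; there are no model weights here.
    image -= learning_rate * content_grad(target, image)
final_cost = content_cost(target, image)

print(final_cost < initial_cost)  # True
```

In the real assignment the gradient flows back through the pretrained network to the input image, but the update target is the same: the image, not the parameters.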

Results of running the code:

[Three image examples, each of the form: content image + style image = generated image]


2. Code

import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow as tf

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# model = load_vgg_model("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/imagenet-vgg-verydeep-19.mat")
# print(model)


# content_image = scipy.misc.imread(
#     "e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/louvre.jpg")


# imshow(content_image)
# plt.show()


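def compute_content_cost(a_C, a_G):
    """Content cost between content activations a_C and generated activations a_G."""
    # Both inputs have shape (1, n_H, n_W, n_C).
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll each activation volume to shape (n_H * n_W, n_C).
    a_C_unrolled = tf.reshape(a_C, [n_H * n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [n_H * n_W, n_C])

    # Sum of squared differences, scaled by 1 / (4 * n_H * n_W * n_C).
    J_content = tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled))) / (4 * n_H * n_W * n_C)

    return J_content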


# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     J_content = compute_content_cost(a_C, a_G)
#     print("J_content = " + str(J_content.eval()))


# style_image = scipy.misc.imread(
#     "e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/monet_800600.jpg")


# imshow(style_image)
# plt.show()


def gram_matrix(A):
    """Gram matrix of A, where A has shape (n_C, n_H * n_W); the result is (n_C, n_C)."""
    GA = tf.matmul(A, tf.transpose(A))

    return GA


# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     A = tf.random_normal([3, 2 * 1], mean=1, stddev=4)
#     GA = gram_matrix(A)
#     print("GA = " + str(GA.eval()))


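def compute_layer_style_cost(a_S, a_G):
    """Style cost for a single layer; a_S and a_G have shape (1, n_H, n_W, n_C)."""
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll to (n_H * n_W, n_C); the transpose below gives (n_C, n_H * n_W).
    a_S = tf.reshape(a_S, [n_H * n_W, n_C])
    a_G = tf.reshape(a_G, [n_H * n_W, n_C])

    # Gram matrices of the style and generated activations, each (n_C, n_C).
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))

    # Squared difference, scaled by 1 / (4 * (n_H * n_W * n_C)^2).
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * tf.square(tf.to_float(n_H * n_W * n_C)))

    return J_style_layer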


# tf.reset_default_graph()
# with tf.Session() as test:
#     tf.set_random_seed(1)
#     a_S = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
#     J_style_layer = compute_layer_style_cost(a_S, a_G)
#     print("J_style_layer = " + str(J_style_layer.eval()))


STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]


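def compute_style_cost(model, STYLE_LAYERS):
    """Weighted sum of the per-layer style costs over STYLE_LAYERS."""
    # Relies on the global `sess`, with the style image already assigned to model['input'].
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:
        # Output tensor of the selected layer.
        out = model[layer_name]

        # Evaluated now: the style image's activations, kept as a fixed array.
        a_S = sess.run(out)

        # Left unevaluated: becomes the generated image's activations
        # once the generated image is assigned to model['input'].
        a_G = out

        J_style_layer = compute_layer_style_cost(a_S, a_G)

        J_style += coeff * J_style_layer

    return J_style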


def total_cost(J_content, J_style, alpha=10, beta=40):
    J = alpha * J_content + beta * J_style
    return J


# tf.reset_default_graph()
# with tf.Session() as test:
#     np.random.seed(3)
#     J_content = np.random.randn()
#     J_style = np.random.randn()
#     J = total_cost(J_content, J_style)
#     print("J = " + str(J))


tf.reset_default_graph()
sess = tf.InteractiveSession()

# content_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/louvre_small.jpg")
content_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/cat.jpg")
content_image = reshape_and_normalize_image(content_image)

style_image = scipy.misc.imread("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/images/style.jpg")
style_image = reshape_and_normalize_image(style_image)


generated_image = generate_noise_image(content_image)
imshow(generated_image[0])
plt.show()

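# Load the pretrained VGG-19 model.
model = load_vgg_model("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/imagenet-vgg-verydeep-19.mat")
# Content cost: evaluate conv4_2 on the content image now (a_C); a_G stays symbolic.
sess.run(model['input'].assign(content_image))
out = model['conv4_2']
a_C = sess.run(out)
a_G = out
J_content = compute_content_cost(a_C, a_G)
# Style cost: assign the style image before building the per-layer costs.
sess.run(model['input'].assign(style_image))
J_style = compute_style_cost(model, STYLE_LAYERS)
# Total cost with alpha=10 and beta=40, minimized by Adam with learning rate 2.0.
J = total_cost(J_content, J_style, 10, 40)
optimizer = tf.train.AdamOptimizer(2.0)
train_step = optimizer.minimize(J)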


def model_nn(sess, input_image, num_iterations=100):
    # Initialize all variables, then start from the noisy input image.
    sess.run(tf.global_variables_initializer())
    sess.run(model['input'].assign(input_image))

    for i in range(num_iterations):
        # One optimizer step, then read back the current generated image.
        sess.run(train_step)
        generated_image = sess.run(model['input'])
        # Every 10 iterations, print the costs and save an intermediate image.
        if i % 10 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            save_image("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/output/" + str(i) + ".png", generated_image)

    # Save the final generated image.
    save_image("e:/code/Python/DeepLearning/Convolution model/week4/Neural Style Transfer/output/generated_image.jpg", generated_image)

    return generated_image


model_nn(sess, generated_image)

3. Summary

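At this point you can create artistic images with the Neural Style Transfer algorithm. This is also the first time you have built a model in which the optimization updates pixel values rather than only the parameters of a neural network. Deep learning has many different types of models, and this is only one of them.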

What you should remember from this section:

  1. Neural Style Transfer is an algorithm that given a content image C and a style image S can generate an artistic image.
  2. It uses representations (hidden layer activations) based on a pretrained ConvNet. 
  3. The content cost function is computed using one hidden layer's activations.
  4. The style cost function for one layer is computed using the Gram matrix of that layer's activations. The overall style cost function is obtained using several hidden layers.
  5. Optimizing the total cost function results in synthesizing new images.