Fun with Deep Learning | 18 Anime Avatar Generation
The images were scraped from getchu.com/, a Japanese anime game site containing a large number of character portraits; 31,970 images were collected in total.

Avatar Extraction
dlib, introduced earlier, can extract human faces but does not work for anime faces.
OpenCV is used instead to crop the face region from each image, based on the following project: github.com/nagadomi/lb…
Each detected bounding box is enlarged appropriately so that more of the character is included.
# -*- coding: utf-8 -*-
import cv2

cascade = cv2.CascadeClassifier('lbpcascade_animeface.xml')
image = cv2.imread('imgs/二次元頭像示例.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.equalizeHist(gray)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(64, 64))

for i, (x, y, w, h) in enumerate(faces):
    # enlarge the detected box by a factor of 1.5 around its center
    cx = x + w // 2
    cy = y + h // 2
    x0 = cx - int(0.75 * w)
    x1 = cx + int(0.75 * w)
    y0 = cy - int(0.75 * h)
    y1 = cy + int(0.75 * h)
    # clip to the image border
    if x0 < 0:
        x0 = 0
    if y0 < 0:
        y0 = 0
    if x1 >= image.shape[1]:
        x1 = image.shape[1] - 1
    if y1 >= image.shape[0]:
        y1 = image.shape[0] - 1
    # make the crop square, then resize to 128x128
    w = x1 - x0
    h = y1 - y0
    if w > h:
        x0 = x0 + w // 2 - h // 2
        x1 = x1 - w // 2 + h // 2
        w = h
    else:
        y0 = y0 + h // 2 - w // 2
        y1 = y1 - h // 2 + w // 2
        h = w
    face = image[y0: y0 + h, x0: x0 + w, :]
    face = cv2.resize(face, (128, 128))
    cv2.imwrite('face_%d.jpg' % i, face)

Tag Extraction
Illustration2Vec is used to extract a rich set of tags from anime images: github.com/rezoo/illus…
Illustration2Vec relies on the chainer deep learning framework, plus a few other libraries; install them if they are missing:
pip install chainer Pillow scikit-image
Illustration2Vec provides three functions:
- represent each image as a 4096-dimensional vector
- given a threshold, return all tags whose probability exceeds it
- given a list of tags, return their corresponding probabilities
For example, extract all plausible tags with a threshold of 0.5:
# -*- coding: utf-8 -*-
import i2v
from imageio import imread

illust2vec = i2v.make_i2v_with_chainer('illust2vec_tag_ver200.caffemodel', 'tag_list.json')

img = imread('imgs/二次元頭像示例.jpg')
tags = illust2vec.estimate_plausible_tags([img], threshold=0.5)
print(tags)
tags = illust2vec.estimate_specific_tags([img], ['blue eyes', 'red hair'])
print(tags)
Tags can also be specified explicitly to get their probabilities; the last call above returns:
[{'blue eyes': 0.9488178491592407, 'red hair': 0.0025324225425720215}]
Preprocessing
Process all of the images on a server, i.e., crop the avatars and extract the tags.
Load the libraries:
# -*- coding: utf-8 -*-
import i2v
import cv2
import glob
import os
from imageio import imread
from tqdm import tqdm
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pickle
Read the image paths:
images = glob.glob('characters/*.jpg')
print(len(images))
Load the two models:
illust2vec = i2v.make_i2v_with_chainer('illust2vec_tag_ver200.caffemodel', 'tag_list.json')
cascade = cv2.CascadeClassifier('lbpcascade_animeface.xml')

OUTPUT_DIR = 'faces/'
if not os.path.exists(OUTPUT_DIR):
    os.mkdir(OUTPUT_DIR)
Extract all of the avatars; 27,772 faces are detected in total:
num = 0
for x in tqdm(range(len(images))):
    img_path = images[x]
    image = cv2.imread(img_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(64, 64))
    for (x, y, w, h) in faces:
        cx = x + w // 2
        cy = y + h // 2
        x0 = cx - int(0.75 * w)
        x1 = cx + int(0.75 * w)
        y0 = cy - int(0.75 * h)
        y1 = cy + int(0.75 * h)
        if x0 < 0:
            x0 = 0
        if y0 < 0:
            y0 = 0
        if x1 >= image.shape[1]:
            x1 = image.shape[1] - 1
        if y1 >= image.shape[0]:
            y1 = image.shape[0] - 1
        w = x1 - x0
        h = y1 - y0
        if w > h:
            x0 = x0 + w // 2 - h // 2
            x1 = x1 - w // 2 + h // 2
            w = h
        else:
            y0 = y0 + h // 2 - w // 2
            y1 = y1 - h // 2 + w // 2
            h = w
        face = image[y0: y0 + h, x0: x0 + w, :]
        face = cv2.resize(face, (128, 128))
        cv2.imwrite(os.path.join(OUTPUT_DIR, '%d.jpg' % num), face)
        num += 1
print(num)
The 34 tags of interest are:
- 13 hair colors: blonde hair, brown hair, black hair, blue hair, pink hair, purple hair, green hair, red hair, silver hair, white hair, orange hair, aqua hair, grey hair
- 5 hair styles: long hair, short hair, twintails, drill hair, ponytail
- 10 eye colors: blue eyes, red eyes, brown eyes, green eyes, purple eyes, yellow eyes, pink eyes, aqua eyes, black eyes, orange eyes
- 6 other attributes: blush, smile, open mouth, hat, ribbon, glasses
For hair color, hair style and eye color, only the most probable tag is kept; each of the other attributes is treated as present when its probability exceeds 0.25:
fw = open('face_tags.txt', 'w')
tags = ['blonde hair', 'brown hair', 'black hair', 'blue hair', 'pink hair', 'purple hair', 'green hair',
        'red hair', 'silver hair', 'white hair', 'orange hair', 'aqua hair', 'grey hair',
        'long hair', 'short hair', 'twintails', 'drill hair', 'ponytail',
        'blue eyes', 'red eyes', 'brown eyes', 'green eyes', 'purple eyes', 'yellow eyes', 'pink eyes',
        'aqua eyes', 'black eyes', 'orange eyes',
        'blush', 'smile', 'open mouth', 'hat', 'ribbon', 'glasses']
fw.write('id,' + ','.join(tags) + '\n')

images = glob.glob(os.path.join(OUTPUT_DIR, '*.jpg'))
for x in tqdm(range(len(images))):
    img_path = images[x]
    image = imread(img_path)
    result = illust2vec.estimate_specific_tags([image], tags)[0]

    # keep only the most probable hair color
    hair_colors = [[h, result[h]] for h in tags[0:13]]
    hair_colors.sort(key=lambda x: x[1], reverse=True)
    for h in tags[0:13]:
        if h == hair_colors[0][0]:
            result[h] = 1
        else:
            result[h] = 0

    # keep only the most probable hair style
    hair_styles = [[h, result[h]] for h in tags[13:18]]
    hair_styles.sort(key=lambda x: x[1], reverse=True)
    for h in tags[13:18]:
        if h == hair_styles[0][0]:
            result[h] = 1
        else:
            result[h] = 0

    # keep only the most probable eye color
    eye_colors = [[h, result[h]] for h in tags[18:28]]
    eye_colors.sort(key=lambda x: x[1], reverse=True)
    for h in tags[18:28]:
        if h == eye_colors[0][0]:
            result[h] = 1
        else:
            result[h] = 0

    # other attributes: present if probability > 0.25
    for h in tags[28:]:
        if result[h] > 0.25:
            result[h] = 1
        else:
            result[h] = 0

    fw.write(img_path + ',' + ','.join([str(result[t]) for t in tags]) + '\n')
fw.close()
This yields 27,772 anime avatars, each annotated with 34 tag values.
Get the 4096-dimensional vector representation of each avatar:
illust2vec = i2v.make_i2v_with_chainer("illust2vec_ver200.caffemodel")

img_all = []
vec_all = []
for x in tqdm(range(len(images))):
    img_path = images[x]
    image = imread(img_path)
    vector = illust2vec.extract_feature([image])[0]
    img_all.append(image / 255.)
    vec_all.append(vector)
img_all = np.array(img_all)
vec_all = np.array(vec_all)
Randomly select 2,000 avatars and visualize them with t-SNE dimensionality reduction:
from sklearn.manifold import TSNE
from imageio import imsave

data_index = np.arange(img_all.shape[0])
np.random.shuffle(data_index)
data_index = data_index[:2000]

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
two_d_vectors = tsne.fit_transform(vec_all[data_index, :])

puzzles = np.ones((6400, 6400, 3))
xmin = np.min(two_d_vectors[:, 0])
xmax = np.max(two_d_vectors[:, 0])
ymin = np.min(two_d_vectors[:, 1])
ymax = np.max(two_d_vectors[:, 1])

for i, vector in enumerate(two_d_vectors):
    x, y = two_d_vectors[i, :]
    x = int((x - xmin) / (xmax - xmin) * (6400 - 128) + 64)
    y = int((y - ymin) / (ymax - ymin) * (6400 - 128) + 64)
    puzzles[y - 64: y + 64, x - 64: x + 64, :] = img_all[data_index[i]]
imsave('二次元頭像降維視覺化.png', puzzles)
The visualization is shown below; similar avatars are indeed grouped together.

Model
The ACGAN framework is used, but unlike the DCGAN used for CelebA, G and D are implemented with deeper, more complex networks, following SRGAN: arxiv.org/abs/1609.04…
The generator structure:
- 16 residual blocks, i.e., the shortcut idea from ResNet
- Sub-pixel CNN instead of deconvolution for upsampling: arxiv.org/abs/1609.05…

The sub-pixel CNN works as follows: several feature maps are interleaved into one, which increases height and width while reducing depth.
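As a rough illustration (an assumed example, not part of the original code), the rearrangement that tf.depth_to_space performs in the generator below can be reproduced in NumPy; depth_to_space_np is a hypothetical helper written only for this sketch:

# NumPy equivalent of tf.depth_to_space for NHWC tensors (illustrative sketch):
# channels are traded for spatial resolution, so (h, w, r*r*c) becomes (h*r, w*r, c).
import numpy as np

def depth_to_space_np(x, r):
    n, h, w, c = x.shape
    c_out = c // (r * r)
    x = x.reshape(n, h, w, r, r, c_out)   # split channels into an r x r block
    x = x.transpose(0, 1, 3, 2, 4, 5)     # interleave block rows/cols with h and w
    return x.reshape(n, h * r, w * r, c_out)

x = np.arange(16, dtype=np.float32).reshape(1, 2, 2, 4)
print(depth_to_space_np(x, 2).shape)  # (1, 4, 4, 1): height and width doubled, depth reduced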

The discriminator uses 10 residual blocks, and its output has two branches: one for real/fake discrimination and one for tag classification.

Implementation
Load the libraries:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import os
from imageio import imread, imsave, mimsave
import glob
from tqdm import tqdm
Load the images:
images = glob.glob('faces/*.jpg')
print(len(images))
Load the tags:
tags = pd.read_csv('face_tags.txt')
tags.index = tags['id']
tags.head()
Define some constants, network tensors and helper functions. A power of two works well for the batch size, so it is set to 64 here; learning rate decay is also applied:
batch_size = 64
z_dim = 128
WIDTH = 128
HEIGHT = 128
LABEL = 34
LAMBDA = 0.05
BETA = 3

OUTPUT_DIR = 'samples'
if not os.path.exists(OUTPUT_DIR):
    os.mkdir(OUTPUT_DIR)

X = tf.placeholder(dtype=tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3], name='X')
X_perturb = tf.placeholder(dtype=tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3], name='X_perturb')
Y = tf.placeholder(dtype=tf.float32, shape=[batch_size, LABEL], name='Y')
noise = tf.placeholder(dtype=tf.float32, shape=[batch_size, z_dim], name='noise')
noise_y = tf.placeholder(dtype=tf.float32, shape=[batch_size, LABEL], name='noise_y')
is_training = tf.placeholder(dtype=tf.bool, name='is_training')

global_step = tf.Variable(0, trainable=False)
add_global = global_step.assign_add(1)
initial_learning_rate = 0.0002
learning_rate = tf.train.exponential_decay(initial_learning_rate, global_step=global_step, decay_steps=20000, decay_rate=0.5)

def lrelu(x, leak=0.2):
    return tf.maximum(x, leak * x)

def sigmoid_cross_entropy_with_logits(x, y):
    return tf.nn.sigmoid_cross_entropy_with_logits(logits=x, labels=y)

def conv2d(inputs, kernel_size, filters, strides, padding='same', use_bias=True):
    return tf.layers.conv2d(inputs=inputs, kernel_size=kernel_size, filters=filters, strides=strides, padding=padding, use_bias=use_bias)

def batch_norm(inputs, is_training=is_training, decay=0.9):
    return tf.contrib.layers.batch_norm(inputs, is_training=is_training, decay=decay)
The discriminator:
def d_block(inputs, filters):
    h0 = lrelu(conv2d(inputs, 3, filters, 1))
    h0 = conv2d(h0, 3, filters, 1)
    h0 = lrelu(tf.add(h0, inputs))
    return h0

def discriminator(image, reuse=None):
    with tf.variable_scope('discriminator', reuse=reuse):
        h0 = image
        f = 32
        for i in range(5):
            if i < 3:
                h0 = lrelu(conv2d(h0, 4, f, 2))
            else:
                h0 = lrelu(conv2d(h0, 3, f, 2))
            h0 = d_block(h0, f)
            h0 = d_block(h0, f)
            f = f * 2
        h0 = lrelu(conv2d(h0, 3, f, 2))
        h0 = tf.contrib.layers.flatten(h0)
        # two heads: 34-way tag classification and real/fake score
        Y_ = tf.layers.dense(h0, units=LABEL)
        h0 = tf.layers.dense(h0, units=1)
        return h0, Y_
The generator:
def g_block(inputs):
    h0 = tf.nn.relu(batch_norm(conv2d(inputs, 3, 64, 1, use_bias=False)))
    h0 = batch_norm(conv2d(h0, 3, 64, 1, use_bias=False))
    h0 = tf.add(h0, inputs)
    return h0

def generator(z, label):
    with tf.variable_scope('generator', reuse=None):
        d = 16
        z = tf.concat([z, label], axis=1)
        h0 = tf.layers.dense(z, units=d * d * 64)
        h0 = tf.reshape(h0, shape=[-1, d, d, 64])
        h0 = tf.nn.relu(batch_norm(h0))
        shortcut = h0

        # 16 residual blocks
        for i in range(16):
            h0 = g_block(h0)
        h0 = tf.nn.relu(batch_norm(h0))
        h0 = tf.add(h0, shortcut)

        # three sub-pixel upsampling stages: 16 -> 32 -> 64 -> 128
        for i in range(3):
            h0 = conv2d(h0, 3, 256, 1, use_bias=False)
            h0 = tf.depth_to_space(h0, 2)
            h0 = tf.nn.relu(batch_norm(h0))

        h0 = tf.layers.conv2d(h0, kernel_size=9, filters=3, strides=1, padding='same', activation=tf.nn.tanh, name='g', use_bias=True)
        return h0
The loss functions. The gradient penalty (gp) term comes from DRAGAN, arxiv.org/abs/1705.07… : WGAN interpolates between real and generated samples, whereas DRAGAN interpolates between real and perturbed samples.
g = generator(noise, noise_y)
d_real, y_real = discriminator(X)
d_fake, y_fake = discriminator(g, reuse=True)

# adversarial losses
loss_d_real = tf.reduce_mean(sigmoid_cross_entropy_with_logits(d_real, tf.ones_like(d_real)))
loss_d_fake = tf.reduce_mean(sigmoid_cross_entropy_with_logits(d_fake, tf.zeros_like(d_fake)))
loss_g_fake = tf.reduce_mean(sigmoid_cross_entropy_with_logits(d_fake, tf.ones_like(d_fake)))

# classification losses on the 34 tags
loss_c_real = tf.reduce_mean(sigmoid_cross_entropy_with_logits(y_real, Y))
loss_c_fake = tf.reduce_mean(sigmoid_cross_entropy_with_logits(y_fake, noise_y))

loss_d = loss_d_real + loss_d_fake + BETA * loss_c_real
loss_g = loss_g_fake + BETA * loss_c_fake

# DRAGAN gradient penalty: interpolate between real and perturbed samples
alpha = tf.random_uniform(shape=[batch_size, 1, 1, 1], minval=0., maxval=1.)
interpolates = alpha * X + (1 - alpha) * X_perturb
grad = tf.gradients(discriminator(interpolates, reuse=True)[0], [interpolates])[0]
slop = tf.sqrt(tf.reduce_sum(tf.square(grad), axis=[1]))
gp = tf.reduce_mean((slop - 1.) ** 2)
loss_d += LAMBDA * gp

vars_g = [var for var in tf.trainable_variables() if var.name.startswith('generator')]
vars_d = [var for var in tf.trainable_variables() if var.name.startswith('discriminator')]
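For comparison with the DRAGAN penalty above, here is a minimal sketch (illustrative only, with hypothetical variable names such as alpha_wgan; it is not part of this model) of how a WGAN-GP style penalty would instead interpolate between the real batch X and the generated batch g:

# Illustrative WGAN-GP style interpolation (NOT used by this model, which keeps the
# DRAGAN penalty above): the endpoints are real samples X and generated samples g.
alpha_wgan = tf.random_uniform(shape=[batch_size, 1, 1, 1], minval=0., maxval=1.)
interpolates_wgan = alpha_wgan * X + (1 - alpha_wgan) * g
grad_wgan = tf.gradients(discriminator(interpolates_wgan, reuse=True)[0], [interpolates_wgan])[0]
slop_wgan = tf.sqrt(tf.reduce_sum(tf.square(grad_wgan), axis=[1, 2, 3]))
gp_wgan = tf.reduce_mean((slop_wgan - 1.) ** 2)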
Define the optimizers:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer_d = tf.train.AdamOptimizer(learning_rate=learning_rate, beta1=0.5).minimize(loss_d, var_list=vars_d)
    optimizer_g = tf.train.AdamOptimizer(learning_rate=learning_rate, beta1=0.5).minimize(loss_g, var_list=vars_g)
A helper function that tiles images into a montage:
def montage(images):
    if isinstance(images, list):
        images = np.array(images)
    img_h = images.shape[1]
    img_w = images.shape[2]
    n_plots = int(np.ceil(np.sqrt(images.shape[0])))
    if len(images.shape) == 4 and images.shape[3] == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5
    elif len(images.shape) == 4 and images.shape[3] == 1:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 1)) * 0.5
    elif len(images.shape) == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1)) * 0.5
    else:
        raise ValueError('Could not parse image shape of {}'.format(images.shape))
    for i in range(n_plots):
        for j in range(n_plots):
            this_filter = i * n_plots + j
            if this_filter < images.shape[0]:
                this_img = images[this_filter]
                m[1 + i + i * img_h:1 + i + (i + 1) * img_h,
                  1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img
    return m
Prepare the data:
X_all = []
Y_all = []
for i in tqdm(range(len(images))):
    image = imread(images[i])
    image = (image / 255. - 0.5) * 2
    X_all.append(image)

    y = list(tags.loc[images[i]])
    Y_all.append(y[1:])
X_all = np.array(X_all)
Y_all = np.array(Y_all)
print(X_all.shape, Y_all.shape)
Define a function that generates random tags. The tag distribution in the original data is imbalanced, but we want G to learn every tag, so tags within each category are sampled uniformly:
def get_random_tags():
    y = np.random.uniform(0.0, 1.0, [batch_size, LABEL]).astype(np.float32)
    # the six "other" attributes: each switched on with probability 0.25
    y[y > 0.75] = 1
    y[y <= 0.75] = 0
    for i in range(batch_size):
        hc = np.random.randint(0, 13)
        hs = np.random.randint(13, 18)
        ec = np.random.randint(18, 28)
        y[i, :28] = 0
        y[i, hc] = 1  # hair color
        y[i, hs] = 1  # hair style
        y[i, ec] = 1  # eye color
    return y
Train the model. In CelebA the male/female ratio is balanced, so drawing a random batch at each iteration is enough; here, because the tag distribution is imbalanced, the whole dataset is iterated over in shuffled epochs:
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# fixed noise and labels for monitoring: each row shares hair color, hair style and eye color
z_samples = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
y_samples = get_random_tags()
for i in range(batch_size):
    y_samples[i, :28] = 0
    y_samples[i, i // 8 % 13] = 1       # hair color
    y_samples[i, i // 8 % 5 + 13] = 1   # hair style
    y_samples[i, i // 8 % 10 + 18] = 1  # eye color
samples = []
loss = {'d': [], 'g': []}

offset = 0
for i in tqdm(range(60000)):
    if offset + batch_size > X_all.shape[0]:
        offset = 0
    if offset == 0:
        data_index = np.arange(X_all.shape[0])
        np.random.shuffle(data_index)
        X_all = X_all[data_index, :, :, :]
        Y_all = Y_all[data_index, :]

    X_batch = X_all[offset: offset + batch_size, :, :, :]
    Y_batch = Y_all[offset: offset + batch_size, :]
    # perturbed copy of the real batch, used by the DRAGAN gradient penalty
    X_batch_perturb = X_batch + 0.5 * X_batch.std() * np.random.random(X_batch.shape)
    offset += batch_size

    n = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
    ny = get_random_tags()
    _, d_ls = sess.run([optimizer_d, loss_d], feed_dict={X: X_batch, X_perturb: X_batch_perturb, Y: Y_batch, noise: n, noise_y: ny, is_training: True})

    n = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
    ny = get_random_tags()
    _, g_ls = sess.run([optimizer_g, loss_g], feed_dict={noise: n, noise_y: ny, is_training: True})

    loss['d'].append(d_ls)
    loss['g'].append(g_ls)
    _, lr = sess.run([add_global, learning_rate])

    if i % 500 == 0:
        print(i, d_ls, g_ls, lr)
        gen_imgs = sess.run(g, feed_dict={noise: z_samples, noise_y: y_samples, is_training: False})
        gen_imgs = (gen_imgs + 1) / 2
        imgs = [img[:, :, :] for img in gen_imgs]
        gen_imgs = montage(imgs)
        plt.axis('off')
        plt.imshow(gen_imgs)
        imsave(os.path.join(OUTPUT_DIR, 'sample_%d.jpg' % i), gen_imgs)
        plt.show()
        samples.append(gen_imgs)

plt.plot(loss['d'], label='Discriminator')
plt.plot(loss['g'], label='Generator')
plt.legend(loc='upper right')
plt.savefig('Loss.png')
plt.show()
mimsave(os.path.join(OUTPUT_DIR, 'samples.gif'), samples, fps=10)
The generated avatars are shown below. Within each row the hair color, hair style and eye color are identical, while the other attributes are random. A small fraction of the results are poor, possibly due to particular noise vectors or label combinations.

Save the model:
saver = tf.train.Saver()
saver.save(sess, './anime_acgan', global_step=60000)
Load the model on a local machine and try the following three things:
- generate random samples following the original tag distribution
- generate samples with specified tags
- fix the noise and generate samples following the original tag distribution
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from imageio import imsave

batch_size = 64
z_dim = 128
LABEL = 34

def montage(images):
    if isinstance(images, list):
        images = np.array(images)
    img_h = images.shape[1]
    img_w = images.shape[2]
    n_plots = int(np.ceil(np.sqrt(images.shape[0])))
    if len(images.shape) == 4 and images.shape[3] == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5
    elif len(images.shape) == 4 and images.shape[3] == 1:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 1)) * 0.5
    elif len(images.shape) == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1)) * 0.5
    else:
        raise ValueError('Could not parse image shape of {}'.format(images.shape))
    for i in range(n_plots):
        for j in range(n_plots):
            this_filter = i * n_plots + j
            if this_filter < images.shape[0]:
                this_img = images[this_filter]
                m[1 + i + i * img_h:1 + i + (i + 1) * img_h,
                  1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img
    return m

def get_random_tags():
    # sample tags following the empirical distribution of the training data
    y = np.random.uniform(0.0, 1.0, [batch_size, LABEL]).astype(np.float32)
    p_other = [0.6, 0.6, 0.25, 0.04488882, 0.3, 0.05384738]
    for i in range(batch_size):
        for j in range(len(p_other)):
            if y[i, j + 28] < p_other[j]:
                y[i, j + 28] = 1
            else:
                y[i, j + 28] = 0

    phc = [0.15968645, 0.21305391, 0.15491921, 0.10523116, 0.07953927, 0.09508879, 0.03567429, 0.07733163, 0.03157895, 0.01833307, 0.02236442, 0.00537514, 0.00182371]
    phs = [0.52989922, 0.37101264, 0.12567589, 0.00291153, 0.00847864]
    pec = [0.28350664, 0.15760678, 0.17862742, 0.13412254, 0.14212126, 0.0543913, 0.01020637, 0.00617501, 0.03167493, 0.00156775]
    for i in range(batch_size):
        y[i, :28] = 0

        hc = np.random.random()
        for j in range(len(phc)):
            if np.sum(phc[:j]) < hc < np.sum(phc[:j + 1]):
                y[i, j] = 1
                break

        hs = np.random.random()
        for j in range(len(phs)):
            if np.sum(phs[:j]) < hs < np.sum(phs[:j + 1]):
                y[i, j + 13] = 1
                break

        ec = np.random.random()
        for j in range(len(pec)):
            if np.sum(pec[:j]) < ec < np.sum(pec[:j + 1]):
                y[i, j + 18] = 1
                break
    return y

sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver = tf.train.import_meta_graph('./anime_acgan-60000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

graph = tf.get_default_graph()
g = graph.get_tensor_by_name('generator/g/Tanh:0')
noise = graph.get_tensor_by_name('noise:0')
noise_y = graph.get_tensor_by_name('noise_y:0')
is_training = graph.get_tensor_by_name('is_training:0')

# random samples following the original tag distribution
z_samples = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
y_samples = get_random_tags()
gen_imgs = sess.run(g, feed_dict={noise: z_samples, noise_y: y_samples, is_training: False})
gen_imgs = (gen_imgs + 1) / 2
imgs = [img[:, :, :] for img in gen_imgs]
gen_imgs = montage(imgs)
gen_imgs = np.clip(gen_imgs, 0, 1)
imsave('1_二次元頭像隨機生成.jpg', gen_imgs)

# samples with specified tags
all_tags = ['blonde hair', 'brown hair', 'black hair', 'blue hair', 'pink hair', 'purple hair', 'green hair', 'red hair', 'silver hair', 'white hair', 'orange hair', 'aqua hair', 'grey hair', 'long hair', 'short hair', 'twintails', 'drill hair', 'ponytail', 'blue eyes', 'red eyes', 'brown eyes', 'green eyes', 'purple eyes', 'yellow eyes', 'pink eyes', 'aqua eyes', 'black eyes', 'orange eyes', 'blush', 'smile', 'open mouth', 'hat', 'ribbon', 'glasses']
for i, tags in enumerate([['blonde hair', 'twintails', 'blush', 'smile', 'ribbon', 'red eyes'],
                          ['silver hair', 'long hair', 'blush', 'smile', 'open mouth', 'blue eyes']]):
    z_samples = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
    y_samples = np.zeros([1, LABEL])
    for tag in tags:
        y_samples[0, all_tags.index(tag)] = 1
    y_samples = np.repeat(y_samples, batch_size, 0)
    gen_imgs = sess.run(g, feed_dict={noise: z_samples, noise_y: y_samples, is_training: False})
    gen_imgs = (gen_imgs + 1) / 2
    imgs = [img[:, :, :] for img in gen_imgs]
    gen_imgs = montage(imgs)
    gen_imgs = np.clip(gen_imgs, 0, 1)
    imsave('%d_二次元頭像指定標籤.jpg' % (i + 2), gen_imgs)

# fixed noise, random tags
z_samples = np.random.uniform(-1.0, 1.0, [1, z_dim]).astype(np.float32)
z_samples = np.repeat(z_samples, batch_size, 0)
y_samples = get_random_tags()
gen_imgs = sess.run(g, feed_dict={noise: z_samples, noise_y: y_samples, is_training: False})
gen_imgs = (gen_imgs + 1) / 2
imgs = [img[:, :, :] for img in gen_imgs]
gen_imgs = montage(imgs)
gen_imgs = np.clip(gen_imgs, 0, 1)
imsave('4_二次元頭像固定噪音.jpg', gen_imgs)
Random samples following the original tag distribution:

Avatars with blonde hair, twintails, blush, smile, ribbon and red eyes:

Avatars with silver hair, long hair, blush, smile, open mouth and blue eyes:

With fixed noise and random tags, the avatars keep roughly the same identity while the details vary:

With all of the above in hand, you can also train an ACGAN on CelebA conditioned on its 40 binary attributes, which is even simpler than the anime avatar case.
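As a rough sketch of that adaptation (hypothetical names; get_random_celeba_tags and p_attr are not from the original code, and the per-attribute frequencies would have to be estimated from the CelebA label file), the random-label sampler reduces to independent Bernoulli draws, since the 40 attributes are binary and not mutually exclusive:

import numpy as np

LABEL_CELEBA = 40  # CelebA provides 40 binary attributes per image

def get_random_celeba_tags(batch_size, p_attr):
    # p_attr: length-40 array of attribute frequencies estimated from the training labels
    # (hypothetical helper, analogous to get_random_tags above)
    y = (np.random.uniform(0.0, 1.0, [batch_size, LABEL_CELEBA]) < p_attr).astype(np.float32)
    return y

# usage sketch: ny = get_random_celeba_tags(64, p_attr)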