A Brief Introduction to the CelebA Dataset and How to Process It for Face Recognition

Posted by 阿新 • Published: 2018-11-10

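CelebA is an open dataset from The Chinese University of Hong Kong. It contains 202,599 images of 10,177 celebrity identities, all annotated with attribute labels, which makes it a very useful dataset for face-related training. Netdisk link

The data consists of three folders and one description file.

The img folder contains two archives:

img_align_celeba.zip & img_align_celeba_png.7z

I chose to download

img_align_celeba.zip

After extraction it contains 202,599 images.

The Anno folder contains a file named identity_CelebA. Part of its content is shown below: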

000001.jpg 2880
000002.jpg 2937
000003.jpg 8692
000004.jpg 5805
000005.jpg 9295
000006.jpg 4153
000007.jpg 9040
000008.jpg 6369
000009.jpg 3332
000010.jpg 612

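This file holds the identity labels for the 10,177 celebrities: the number after each image name is the label of that image.

Next, we use these two files to process the dataset.

First, we use the dlib library for face detection, crop out each detected face, and save it. The code is as follows: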
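As a quick illustration of this file format, the sketch below parses lines of the form shown above into a dictionary and counts the images per identity. The names `parse_identity_lines` and `images_per_identity` are hypothetical, chosen only for this example, and the sample lines are taken from the excerpt above.

```python
from collections import Counter


def parse_identity_lines(lines):
    """Map each image file name to its integer identity label."""
    labels = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, identity = parts
        labels[name] = int(identity)
    return labels


def images_per_identity(labels):
    """Count how many images belong to each identity."""
    return Counter(labels.values())


sample = [
    "000001.jpg 2880",
    "000002.jpg 2937",
    "000003.jpg 8692",
]
labels = parse_identity_lines(sample)
counts = images_per_identity(labels)
```

For the real file, the same function can be given the open file object instead of the sample list, since iterating over a file yields its lines.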

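import os

import cv2
import dlib


# Read the image file names (first column) from the identity file.
def read_txt_file(file):
    inde = []
    with open(file, 'r') as f:
        for line in f:
            items = line.split()
            if items:
                inde.append(items[0])
    return inde


# Return the image paths in a folder, sorted by numeric file name.
def face_path(path):
    file_names = os.listdir(path)
    file_names.sort(key=lambda x: int(x[:-4]))
    return [path + '/' + name for name in file_names]


def face_detection():
    inde = read_txt_file('/home/zy/PycharmProjects/CelebA/identity_CelebA.txt')
    file_path = face_path('/home/zy/PycharmProjects/CelebA/img_align_celeba')
    out_dir = '/home/zy/PycharmProjects/CelebA/cropdata'
    os.makedirs(out_dir, exist_ok=True)

    # Create the detector once instead of once per image.
    detector = dlib.get_frontal_face_detector()

    for i, (name, f) in enumerate(zip(inde, file_path), start=1):
        # The sorted image list and the identity file must line up.
        if os.path.basename(f) != name:
            raise ValueError('{} does not match {}'.format(f, name))

        img = cv2.imread(f, cv2.IMREAD_COLOR)
        if img is None:
            print('Could not read {}'.format(f))
            continue

        # dlib expects RGB, while OpenCV loads images as BGR.
        img2 = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        dets = detector(img2, 1)
        if len(dets) == 0:
            print(i)  # index of an image with no detected face
        print("Number of faces detected: {}".format(len(dets)))

        for index, face in enumerate(dets):
            print('face {}; left {}; top {}; right {}; bottom {}'.format(
                index, face.left(), face.top(), face.right(), face.bottom()))

            # Clamp to the image, since a detection can extend past the border.
            left = max(face.left(), 0)
            top = max(face.top(), 0)
            right = min(face.right(), img.shape[1])
            bottom = min(face.bottom(), img.shape[0])
            if right <= left or bottom <= top:
                continue
            # cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 3)
            imgs = img[top:bottom, left:right]
            # If several faces are found, later crops overwrite earlier ones.
            cv2.imwrite(out_dir + '/' + name, imgs)


face_detection()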
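When more than one face is detected in an image, the loop above writes every crop to the same file name, so only the last one survives. One possible alternative, sketched here under that assumption, is to keep only the largest rectangle. `box_area` and `largest_box` are hypothetical helpers that work on plain `(left, top, right, bottom)` tuples; they are not part of dlib or OpenCV.

```python
def box_area(box):
    """Area of a (left, top, right, bottom) box; degenerate boxes count as 0."""
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def largest_box(boxes):
    """Return the box with the largest area, or None for an empty list."""
    if not boxes:
        return None
    return max(boxes, key=box_area)


boxes = [(10, 20, 60, 80), (30, 40, 150, 190), (0, 0, 5, 5)]
best = largest_box(boxes)
```

With dlib, each detection could be converted to such a tuple from its `left()`, `top()`, `right()`, and `bottom()` values before calling `largest_box`.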
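After face detection, you will find that some faces could not be detected, so a new file of image paths and their labels has to be built from identity_CelebA. The code is as follows: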

import os

img_path = '/home/zy/PycharmProjects/CelebA/cropdata'
text_file = '/home/zy/PycharmProjects/CelebA/identity_CelebA.txt'
file_path = os.listdir(img_path)
file_path.sort(key=lambda x: int(x[:-4]))


def train_path():
    # Map each image name to its identity label, so every cropped image
    # needs one dictionary lookup instead of a scan of the whole file.
    labels = {}
    with open(text_file, 'r') as f:
        for line in f:
            items = line.split()
            if len(items) == 2:
                labels[items[0]] = items[1]

    inde = []
    for name in file_path:
        if name in labels:
            # One entry per cropped image: "<image path> <label>"
            inde.append(img_path + '/' + name + ' ' + labels[name])
    return inde


data_set = train_path()
with open('trainggg_text', 'w') as f:
    for entry in data_set:
        f.write(entry + '\n')
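To see which images were dropped because no face was detected, the names in identity_CelebA.txt can be compared with the files actually present in the crop folder. This is a minimal sketch, with `missing_images` as a hypothetical helper and small in-memory lists standing in for the real file and directory listing.

```python
def missing_images(identity_lines, cropped_names):
    """Return image names listed in the identity file but absent from the crops."""
    cropped = set(cropped_names)
    missing = []
    for line in identity_lines:
        parts = line.split()
        if parts and parts[0] not in cropped:
            missing.append(parts[0])
    return missing


identity_lines = ["000001.jpg 2880", "000002.jpg 2937", "000003.jpg 8692"]
cropped_names = ["000001.jpg", "000003.jpg"]
dropped = missing_images(identity_lines, cropped_names)
```

In practice `cropped_names` would come from `os.listdir` on the crop folder and `identity_lines` from the opened identity file.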
If you want to reorganize the dataset so that each folder holds the images of a single person, you can use the following code:

import os
from collections import defaultdict

import cv2

with open('./trainggg_text', 'r') as f:
    lines = f.readlines()

# Group the image paths by identity label in a single pass.
groups = defaultdict(list)
for line in lines:
    line = line.strip('\n')
    if not line:
        continue
    path, label = line.rsplit(' ', 1)
    groups[int(label)].append(path)

# Folders are created only for identities that have at least one cropped image.
for j, paths in sorted(groups.items()):
    print(j)
    out_dir = './ace/' + str(j) + '/' + str(0)
    os.makedirs(out_dir, exist_ok=True)

    for l, path in enumerate(paths, start=1):
        img = cv2.imread(path)
        if img is None:
            continue  # skip files that could not be read
        cv2.imwrite(out_dir + '/' + 'zy' + str(l) + '.jpg', img)
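Because the images are only being sorted into per-identity folders, copying the files directly avoids decoding and re-encoding every JPEG. The sketch below is one way to do that with the standard library. `group_by_identity` is a hypothetical helper, and the `zy` prefix and `0` subfolder simply mirror the layout used above.

```python
import os
import shutil
from collections import defaultdict


def group_by_identity(list_file, out_root):
    """Copy images listed as '<path> <label>' into out_root/<label>/0/."""
    groups = defaultdict(list)
    with open(list_file, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the last space so paths containing spaces still work.
            path, label = line.rsplit(" ", 1)
            groups[label].append(path)

    copied = 0
    for label, paths in groups.items():
        out_dir = os.path.join(out_root, label, "0")
        os.makedirs(out_dir, exist_ok=True)
        for n, src in enumerate(paths, start=1):
            ext = os.path.splitext(src)[1]
            dst = os.path.join(out_dir, "zy{}{}".format(n, ext))
            shutil.copyfile(src, dst)
            copied += 1
    return copied
```

A call such as `group_by_identity('./trainggg_text', './ace')` would then produce the same folder layout as the code above and return the number of files copied.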
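---------------------
Author: 益達888
Source: CSDN
Original: https://blog.csdn.net/qq_29023939/article/details/81299178?utm_source=copy
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.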
