Building Your Own Dataset with TFRecord

By 阿新 • Published: 2017-12-18


Notes:

1. Input image format

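When calling io.imread(), choose the value of the as_grey argument according to the input image. After converting the image to a byte string (image.tostring()), print the length of image_raw, because different image encodings end up stored in different ways.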

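The grayscale JPEG I read in came back as int64, so image_raw was 8 times the size of the image. An RGB image, on the other hand, is always uint8. Once you know the type, the type passed to decode_raw at decoding time must be the same as the type the data was stored with.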
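The size relationship can be checked without TensorFlow. The sketch below uses np.frombuffer as a stand-in for decode_raw and a made-up 200x300 int64 array in place of a real image; both are assumptions for illustration only.

```python
import numpy as np

# Hypothetical 200x300 grayscale image held as int64, as in the case described above.
height, width = 200, 300
image = np.arange(height * width, dtype=np.int64).reshape(height, width) % 256

# tobytes() returns the same raw bytes as the older tostring().
image_raw = image.tobytes()

# int64 takes 8 bytes per pixel, so the byte string is 8x the pixel count.
ratio = len(image_raw) // (height * width)

# Decoding with the dtype used when writing restores the image exactly.
restored = np.frombuffer(image_raw, dtype=np.int64).reshape(height, width)

# Decoding as uint8 gives 8x too many elements, so a later reshape to
# (height, width) would fail.
wrong = np.frombuffer(image_raw, dtype=np.uint8)
```

If the dtype at decode time does not match, the usual symptom is a reshape error about the number of elements.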

From the length of image_raw and the size of the original image you can work out which type was used. The common ones are uint8, int32 and int64.
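That calculation could be sketched roughly as follows. infer_dtype is a hypothetical helper, not something from the original code, and it only knows the three integer types listed above; a 4-byte or 8-byte element could just as well be float32 or float64, so treat the result as a hint.

```python
import numpy as np

def infer_dtype(raw_length, shape):
    """Guess the element dtype from the byte length and the image shape."""
    num_elements = int(np.prod(shape))
    if num_elements == 0 or raw_length % num_elements != 0:
        raise ValueError("byte length is not a multiple of the element count")
    bytes_per_element = raw_length // num_elements
    # Only the integer types mentioned in the text; floats share these sizes.
    candidates = {1: np.uint8, 4: np.int32, 8: np.int64}
    if bytes_per_element not in candidates:
        raise ValueError("unexpected element size: %d bytes" % bytes_per_element)
    return candidates[bytes_per_element]
```

For example, a 200x300 RGB image whose image_raw is 180000 bytes long works out to 1 byte per element, i.e. uint8.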

2. Converting to TFRecords takes a while, so be prepared to wait.

import os
import tensorflow as tf
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
import cv2

def get_data(file_path):
    # Collect every image path under file_path/<sub-directory>/ with its label.
    data = []
    label = []
    for dirs in os.listdir(file_path):
        temp_path = os.path.join(file_path, dirs)
        for dirss in os.listdir(temp_path):
            data.append(os.path.join(temp_path, dirss))
        num_img = len(os.listdir(temp_path))
        # Every image gets label 1 here, whichever sub-directory it came from.
        label = np.append(label, num_img * [1])
    # Shuffle paths and labels together so that the pairs stay aligned.
    temp = np.array([data, label])
    temp = temp.transpose()
    np.random.shuffle(temp)
    image_list = list(temp[:, 0])
    label_list = list(temp[:, 1])
    # The labels turned into strings inside the mixed array; convert back to int.
    label_list = [int(float(i)) for i in label_list]
    return image_list, label_list
# Helpers that wrap a value in a tf.train.Feature (the image goes in as a byte string)
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def convert_tfrecord(images, labels, save_filename):
    writer = tf.python_io.TFRecordWriter(save_filename)
    print("Transform start....")
    num_examples = len(labels)
    # images is a plain list of paths, so use len() rather than .shape.
    if len(images) != num_examples:
        raise ValueError('Images size %d does not match label size %d.' % (len(images), num_examples))
    for index in np.arange(0, num_examples):
        try:
            # Newer scikit-image releases spell this argument as_gray.
            image = io.imread(images[index], as_grey=False)
            #image = tf.image.decode_jpeg(images[index])
            #print(image.shape)
            # Raw pixel bytes; tostring() is the older name for tobytes().
            image_raw = image.tobytes()
            #print(len(image_raw))
            example = tf.train.Example(features=tf.train.Features(feature={
                'label': _int64_feature(int(labels[index])),
                'image_raw': _bytes_feature(image_raw)
            }))
            writer.write(example.SerializeToString())
        except IOError as e:
            print('Could not read:', images[index])
            print('error :%s Skip it !\n' % e)
    writer.close()
    print("success!")

def read_and_decode(tfrecords_file, batch_size):
    reader = tf.TFRecordReader()
    filename_queue = tf.train.string_input_producer([tfrecords_file])
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string)
        }
    )
    #print(features['image_raw'])
    capacity = 1000 + 3 * batch_size
    # The dtype here must match the dtype the image had when it was written.
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    label = tf.cast(features['label'], tf.int32)
    #image = tf.image.resize_images(image,[300, 200, 1])
    # Assumes every stored image is 200 x 300 with 3 channels.
    image = tf.reshape(image, [200, 300, 3])
    image_batch, label_batch = tf.train.batch([image, label],
                                              batch_size=batch_size,
                                              capacity=capacity)
    # Crop or pad to 100 x 100, then scale pixel values to [0, 1].
    image_batch = tf.image.resize_image_with_crop_or_pad(image_batch, 100, 100)
    image_batch = tf.cast(image_batch, tf.float32) * (1. / 255)
    return image_batch, label_batch
def plot_images(images, labels):
    '''Plot the first two images of one batch.
    '''
    for i in np.arange(0, 2):
        plt.subplot(3, 3, i + 1)
        plt.axis('off')
        # plt.title((labels[i] - 1), fontsize = 14)
        plt.subplots_adjust(top=1)
        print(labels[i])
        print(images.shape)
        # print(images[i].shape)
        plt.imshow(images[i][:, :, :])
    plt.show()
def train():
    # Raw string so the backslash in the Windows path is not read as an escape.
    image, label = get_data(r'E:\syn_data')
    convert_tfrecord(image, label, '1.tfrecords')
    x_batch, y_batch = read_and_decode('1.tfrecords', batch_size=2)
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        try:
            i = 0
            while not coord.should_stop() and i < 3:
                # fetch and plot one batch per iteration
                image, label = sess.run([x_batch, y_batch])
                plot_images(image, label)
                i += 1
        except tf.errors.OutOfRangeError:
            print('done!')
        finally:
            coord.request_stop()
        coord.join(threads)

#train()
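One thing to watch in get_data: it gives every image the label 1. If each sub-directory is meant to be one class (an assumption, the post does not say so), the labels could follow the sorted directory order instead. list_images_with_labels below is a hypothetical helper sketched for that case, not part of the original code.

```python
import os

def list_images_with_labels(root_dir):
    """Return (paths, labels), labelling each sub-directory by its sorted index."""
    paths, labels = [], []
    # Sort so that the class-to-index mapping is the same on every run.
    class_dirs = sorted(
        d for d in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, d))
    )
    for class_index, class_name in enumerate(class_dirs):
        class_path = os.path.join(root_dir, class_name)
        for file_name in sorted(os.listdir(class_path)):
            paths.append(os.path.join(class_path, file_name))
            labels.append(class_index)
    return paths, labels
```

The two returned lists can be passed to convert_tfrecord in place of the output of get_data; shuffling would still have to be done separately.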
