SSD-Tensorflow: Training on the KITTI Dataset

阿新 • Published: 2019-01-10

# Copyright 2015 Paul Balanca. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Converts KITTI data to TFRecords file format with Example protos.

The raw KITTI data set is expected to reside in PNG files located in the
directory 'image_2'. Similarly, bounding box annotations are expected to be
stored as text files in the directory 'label_2'.

This TensorFlow script converts the images and annotations into a sharded
data set of TFRecord files named '<name>_<idx>.tfrecord'.

Each TFRecord file contains up to SAMPLES_PER_FILES (512) records. Each record
within the TFRecord file is a serialized Example proto. The Example proto
contains the following fields:

    image/encoded: string containing PNG encoded image in RGB colorspace
    image/height: integer, image height in pixels
    image/width: integer, image width in pixels
    image/channels: integer, specifying the number of channels, always 3
    image/shape: list of 3 integers, [height, width, channels]
    image/format: string, specifying the format, always 'PNG'

    object/bbox/xmin: list of float, left edge of each of the 0+ human
        annotated bounding boxes, normalized by the image width
    object/bbox/xmax: list of float, right edge of each bounding box,
        normalized by the image width
    object/bbox/ymin: list of float, top edge of each bounding box,
        normalized by the image height
    object/bbox/ymax: list of float, bottom edge of each bounding box,
        normalized by the image height
    object/label: list of integer specifying the classification index.
    object/label_text: list of string descriptions.
    object/truncated, object/occluded, object/alpha, object/dimensions/*,
        object/location/*, object/rotation_y: remaining KITTI annotations.

Note that the length of xmin is identical to the length of xmax, ymin and ymax
for each example.
"""
import os
import os.path
import sys
import random

import numpy as np
import tensorflow as tf

from datasets.dataset_utils import int64_feature, float_feature, bytes_feature
from datasets.kitti_common import KITTI_LABELS

DEFAULT_IMAGE_DIR = 'image_2/'
DEFAULT_LABEL_DIR = 'label_2/'

# TFRecords conversion parameters.
RANDOM_SEED = 4242
SAMPLES_PER_FILES = 512

def _png_image_shape(image_data, sess, decoded_png, inputs):
    rimg = sess.run(decoded_png, feed_dict={inputs: image_data})
    return rimg.shape


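def _process_image(directory, name, f_png_image_shape,
                   image_dir=DEFAULT_IMAGE_DIR, label_dir=DEFAULT_LABEL_DIR):
    """Process an image and its annotation file.

    Args:
      directory: KITTI dataset directory;
      name: file name, without extension;
      f_png_image_shape: function returning the shape of a PNG-encoded image;
      image_dir: image sub-directory;
      label_dir: label sub-directory.
    Returns:
      image_data: string, PNG encoding of RGB image;
      shape: list of 3 integers, [height, width, channels];
      followed by the per-object annotation lists (labels, labels_text,
      truncated, occluded, alpha, bboxes, dimensions, locations, rotation_y).
    """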
    # Read the PNG image file.
    filename = os.path.join(directory, image_dir, name + '.png')
    image_data = tf.gfile.FastGFile(filename, 'rb').read()
    shape = list(f_png_image_shape(image_data))

    # Get object annotations.
    labels = []
    labels_text = []
    truncated = []
    occluded = []
    alpha = []
    bboxes = []
    dimensions = []
    locations = []
    rotation_y = []

    # Read the txt label file, if it exists.
    filename = os.path.join(directory, label_dir, name + '.txt')
    if os.path.exists(filename):
        with open(filename) as f:
            label_data = f.readlines()
        for l in label_data:
            data = l.split()
            if len(data) > 0:
                # Label.
                labels.append(int(KITTI_LABELS[data[0]][0]))
                labels_text.append(data[0].encode('ascii'))
                # truncated, occluded and alpha.
                truncated.append(float(data[1]))
                occluded.append(int(data[2]))
                alpha.append(float(data[3]))
                # bbox.
                bboxes.append((float(data[4]) / shape[1],
                               float(data[5]) / shape[0],
                               float(data[6]) / shape[1],
                               float(data[7]) / shape[0]
                               ))
                # 3D dimensions.
                dimensions.append((float(data[8]),
                                   float(data[9]),
                                   float(data[10])
                                   ))
                # 3D location and rotation_y.
                locations.append((float(data[11]),
                                  float(data[12]),
                                  float(data[13])
                                  ))
                rotation_y.append(float(data[14]))

    return (image_data, shape, labels, labels_text, truncated, occluded,
            alpha, bboxes, dimensions, locations, rotation_y)


def _convert_to_example(image_data, shape, labels, labels_text,
                        truncated, occluded, alpha, bboxes,
                        dimensions, locations, rotation_y):
    """Build an Example proto for an image example.

    Args:
      image_data: string, PNG encoding of RGB image;
      labels: list of integers, identifier for the ground truth;
      labels_text: list of strings, human-readable labels;
      bboxes: list of bounding boxes; each box is a tuple of floats
          (xmin, ymin, xmax, ymax), normalized by the image size;
      shape: 3 integers, image shape in pixels.
    Returns:
      Example proto
    """
    # Transpose bboxes, dimensions and locations.
    bboxes = list(map(list, zip(*bboxes)))
    dimensions = list(map(list, zip(*dimensions)))
    locations = list(map(list, zip(*locations)))
    # Iterators.
    it_bboxes = iter(bboxes)
    it_dims = iter(dimensions)
    its_locs = iter(locations)

    image_format = b'PNG'
    example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': int64_feature(shape[0]),
            'image/width': int64_feature(shape[1]),
            'image/channels': int64_feature(shape[2]),
            'image/shape': int64_feature(shape),
            'image/format': bytes_feature(image_format),
            'image/encoded': bytes_feature(image_data),
            'object/label': int64_feature(labels),
            'object/label_text': bytes_feature(labels_text),
            'object/truncated': float_feature(truncated),
            'object/occluded': int64_feature(occluded),
            'object/alpha': float_feature(alpha),
            'object/bbox/xmin': float_feature(next(it_bboxes, [])),
            'object/bbox/ymin': float_feature(next(it_bboxes, [])),
            'object/bbox/xmax': float_feature(next(it_bboxes, [])),
            'object/bbox/ymax': float_feature(next(it_bboxes, [])),
            'object/dimensions/height': float_feature(next(it_dims, [])),
            'object/dimensions/width': float_feature(next(it_dims, [])),
            'object/dimensions/length': float_feature(next(it_dims, [])),
            'object/location/x': float_feature(next(its_locs, [])),
            'object/location/y': float_feature(next(its_locs, [])),
            'object/location/z': float_feature(next(its_locs, [])),
            'object/rotation_y': float_feature(rotation_y),
            }))
    return example


def _add_to_tfrecord(dataset_dir, name, tfrecord_writer, f_png_image_shape,
                     image_dir=DEFAULT_IMAGE_DIR, label_dir=DEFAULT_LABEL_DIR):
    """Loads data from image and annotations files and add them to a TFRecord.

    Args:
      dataset_dir: Dataset directory;
      name: Image name to add to the TFRecord;
      tfrecord_writer: The TFRecord writer to use for writing.
    """
    l_data = _process_image(dataset_dir, name, f_png_image_shape,
                            image_dir, label_dir)
    example = _convert_to_example(*l_data)
    tfrecord_writer.write(example.SerializeToString())

'''
def _get_output_filename(output_dir, name):
    return '%s/%s.tfrecord' % (output_dir, name)
'''
def _get_output_filename(output_dir, name, idx):
    return '%s/%s_%03d.tfrecord' % (output_dir, name, idx)


def run(dataset_dir, output_dir, name='kitti_train', shuffling=False):
    """Runs the conversion operation.

    Args:
      dataset_dir: The dataset directory where the dataset is stored.
      output_dir: Output directory.
    """
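    # Create the output directory if needed (the input directory must
    # already exist, otherwise there is nothing to convert).
    if not tf.gfile.Exists(output_dir):
        tf.gfile.MakeDirs(output_dir)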
    '''
    tf_filename = _get_output_filename(output_dir, name)
    if tf.gfile.Exists(tf_filename):
        print('Dataset files already exist. Exiting without re-creating them.')
        # return
    '''
    # Dataset filenames, and shuffling.
    path = os.path.join(dataset_dir, DEFAULT_IMAGE_DIR)
    filenames = sorted(os.listdir(path))
    if shuffling:
        random.seed(RANDOM_SEED)
        random.shuffle(filenames)

    # PNG decoding.
    inputs = tf.placeholder(dtype=tf.string)
    decoded_png = tf.image.decode_png(inputs)
    with tf.Session() as sess:
        fidx = 0
        i = 0
        while i < len(filenames):
            # Name of the next output shard.
            tf_filename = _get_output_filename(output_dir, name, fidx)
        #    print(tf_filename)
            # Process dataset files.
            with tf.python_io.TFRecordWriter(tf_filename) as tfrecord_writer:
                j=0
                while i < len(filenames) and j < SAMPLES_PER_FILES:
                    sys.stdout.write('\r>> Converting image %d/%d' % (i+1, len(filenames)))
                    sys.stdout.flush()
                    filename = filenames[i]
                    img_name = filename[:-4]
                    _add_to_tfrecord(dataset_dir, img_name, tfrecord_writer,
                                     lambda x: _png_image_shape(x, sess, decoded_png, inputs))
                    i += 1
                    j += 1
                fidx += 1
    '''
                for i, filename in enumerate(filenames):
                    sys.stdout.write('\r>> Converting image %d/%d' % (i+1, len(filenames)))
                    sys.stdout.flush()

                    name = filename[:-4]
                    _add_to_tfrecord(dataset_dir, name, tfrecord_writer,
                                     lambda x: _png_image_shape(x, sess, decoded_png, inputs))
                                     '''
    print('\nFinished converting the KITTI dataset!')
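
For readers unfamiliar with the KITTI label layout that `_process_image` indexes into, the following is a minimal standalone sketch of the same parsing step, without TensorFlow. The `KittiObject` type, the `parse_kitti_line` helper, the sample line, and the 375x1242 image size are all made up for illustration; they are not part of the original script.

```python
from typing import NamedTuple, Tuple


class KittiObject(NamedTuple):
    """One annotated object (hypothetical container, for illustration)."""
    label_text: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, float, float, float]  # normalized xmin, ymin, xmax, ymax
    dimensions: Tuple[float, float, float]   # columns 8-10 of the label line
    location: Tuple[float, float, float]     # columns 11-13 of the label line
    rotation_y: float


def parse_kitti_line(line, height, width):
    """Split one label line using the same column order as the script above."""
    data = line.split()
    return KittiObject(
        label_text=data[0],
        truncated=float(data[1]),
        occluded=int(data[2]),
        alpha=float(data[3]),
        # Pixel coordinates are divided by the image size, as in the script.
        bbox=(float(data[4]) / width, float(data[5]) / height,
              float(data[6]) / width, float(data[7]) / height),
        dimensions=(float(data[8]), float(data[9]), float(data[10])),
        location=(float(data[11]), float(data[12]), float(data[13])),
        rotation_y=float(data[14]),
    )


# A made-up label line in the 15-column layout the script expects.
sample = ("Car 0.00 0 -1.50 100.0 150.0 300.0 250.0 "
          "1.50 1.60 4.00 1.0 1.5 10.0 -1.40")
obj = parse_kitti_line(sample, height=375, width=1242)
print(obj.label_text, obj.bbox)
```

Normalizing the box by the image size at conversion time means the stored coordinates stay valid if the image is later resized during preprocessing.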

That completes the dataset conversion code. Next, dataset_factory.py needs a small change: add one entry at the end, as I did.
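
The original screenshot of that change is not available here, so the following is only a sketch of what such an entry might look like, assuming the factory keeps a dictionary that maps dataset names to dataset modules. The `'kitti'` key, the stand-in module objects, and the simplified `get_dataset` signature are assumptions for illustration, not the author's actual code.

```python
from types import SimpleNamespace

# Stand-ins for the real dataset modules (hypothetical). In the project these
# would be imported, e.g. a kitti module exposing a get_split function.
pascalvoc_2007 = SimpleNamespace(
    get_split=lambda split, data_dir: ('pascalvoc_2007', split, data_dir))
kitti = SimpleNamespace(
    get_split=lambda split, data_dir: ('kitti', split, data_dir))

datasets_map = {
    'pascalvoc_2007': pascalvoc_2007,
    'kitti': kitti,  # the new entry, added at the end (assumed key name)
}


def get_dataset(name, split_name, dataset_dir):
    """Look up a dataset by name and return the requested split."""
    if name not in datasets_map:
        raise ValueError('Name of dataset unknown %s' % name)
    return datasets_map[name].get_split(split_name, dataset_dir)


print(get_dataset('kitti', 'train', '/tmp/kitti'))
```

With a registry like this, the training script can select the new dataset by name without any other code changes.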
