
Deep Learning (Andrew Ng): Course 4, Week 3 Programming Assignment

Week 3 covers object detection, and the programming assignment is built around the YOLO network. The main work of the assignment is post-processing the YOLO output: score filtering over the anchor boxes, IoU-based filtering, and non-max suppression.

Theory

Intersection over Union (IoU) is a concept used in object detection: the overlap ratio between a candidate box and the ground-truth box, i.e. the ratio of their intersection to their union. The larger the IoU, the more accurate the candidate box; the ideal case is complete overlap, where the ratio equals 1.

IoU

[Figure: intersection over union of a candidate box and the ground-truth box]
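As a quick sanity check of the formula, here is a tiny hand-worked example in plain Python (the two boxes are made up for illustration):

# Two boxes in (x1, y1, x2, y2) form; the values are made up for illustration.
box_a = (2, 1, 4, 3)   # area 2 * 2 = 4
box_b = (1, 2, 3, 4)   # area 2 * 2 = 4

# Intersection corners and area
xi1, yi1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])   # (2, 2)
xi2, yi2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])   # (3, 3)
inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)                 # 1

# Union(A, B) = A + B - Inter(A, B)
union = 4 + 4 - inter                                          # 7
print(inter / union)                                           # 0.142857...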

Anchor boxes

Anchor boxes are a small set of predefined box shapes (this assignment uses 5 per grid cell). Each cell predicts one box per anchor, which lets a single cell detect overlapping objects of different shapes.

[Figure: anchor boxes]
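As a rough sketch of how anchors are used, assuming the usual YOLO convention that a ground-truth box is assigned to the anchor shape it overlaps best (the five anchor sizes below are made up, not the values in model_data/yolo_anchors.txt):

import numpy as np

# Five made-up anchor shapes as (width, height), in grid-cell units.
anchors = np.array([[1.0, 1.5], [2.0, 1.0], [3.5, 3.5], [1.0, 3.0], [5.0, 2.5]])

def shape_iou(wh1, wh2):
    """IoU of two boxes that share the same center, compared purely by shape."""
    inter = min(wh1[0], wh2[0]) * min(wh1[1], wh2[1])
    union = wh1[0] * wh1[1] + wh2[0] * wh2[1] - inter
    return inter / union

# A ground-truth box that is tall and thin gets matched to the tall, thin anchor.
gt_wh = (1.2, 2.8)
best = int(np.argmax([shape_iou(gt_wh, a) for a in anchors]))
print(best)  # -> 3, so this object is encoded in the anchor-3 slot of its grid cell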

Non-max suppression

During detection there are usually several overlapping candidate boxes for the same object; non-max suppression keeps the box with the highest probability and discards the overlapping boxes that are not the maximum. For multi-class detection, non-max suppression is run once per class.

[Figure: non-max suppression]


As the algorithm shows, there are two discard steps:
1. discard every box in a grid cell with Pc <= 0.6;
2. among the remaining boxes, run non-max suppression and discard any box whose overlap (IoU) with a higher-scoring box is greater than 0.5.

[Figure: the non-max suppression algorithm]
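To make the two discard steps concrete, here is a minimal NumPy sketch of greedy non-max suppression on a few hand-made boxes (the coordinates, scores, and the helper name simple_nms are all just for illustration); in the assignment itself this step is delegated to tf.image.non_max_suppression.

import numpy as np

def simple_nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS on (x1, y1, x2, y2) boxes; returns the indices of kept boxes."""
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # intersection of the best box with all remaining boxes
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter)
        order = rest[iou <= iou_thresh]        # drop boxes that overlap the kept box too much
    return keep

boxes = np.array([[100, 100, 200, 200],
                  [105, 105, 205, 205],
                  [300, 300, 380, 380]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(simple_nms(boxes, scores))  # -> [0, 2]; box 1 is suppressed by box 0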

Code

PyCharm version

The assignment originally ships as a Jupyter notebook, but since that environment was not set up here, the code was consolidated and debugged in PyCharm instead.

import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Lambda, Conv2D
from keras.models import load_model, Model
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body
from keras.utils import plot_model


def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold=.6):
    """Filters YOLO boxes by thresholding on object and class confidence.

    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1)
    boxes -- tensor of shape (19, 19, 5, 4)
    box_class_probs -- tensor of shape (19, 19, 5, 80)
    threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box

    Returns:
    scores -- tensor of shape (None,), containing the class probability score for selected boxes
    boxes -- tensor of shape (None, 4), containing (b_x, b_y, b_h, b_w) coordinates of selected boxes
    classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes

    Note: "None" is here because you don't know the exact number of selected boxes, as it depends on the threshold.
    For example, the actual output size of scores would be (10,) if there are 10 boxes.
    """
    # Step 1: Compute box scores
    ### START CODE HERE ### (≈ 1 line)
    box_scores = box_confidence * box_class_probs  # per-cell probability of each class for each anchor box
    ### END CODE HERE ###

    # Step 2: Find the box_classes thanks to the max box_scores, keep track of the corresponding score
    ### START CODE HERE ### (≈ 2 lines)
    box_classes = K.argmax(box_scores, axis=-1)  # index of the class with the largest score
    box_class_scores = K.max(box_scores, axis=-1, keepdims=False)  # the largest class probability, i.e. the score
    ### END CODE HERE ###

    # Step 3: Create a filtering mask based on "box_class_scores" by using "threshold". The mask should have the
    # same dimension as box_class_scores, and be True for the boxes you want to keep (with probability >= threshold)
    ### START CODE HERE ### (≈ 1 line)
    filtering_mask = box_class_scores >= threshold  # mask that filters out boxes scoring below the threshold
    ### END CODE HERE ###

    # Step 4: Apply the mask to scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)
    ### END CODE HERE ###

    return scores, boxes, classes


def iou(box1, box2):
    """Implement the intersection over union (IoU) between box1 and box2

    Arguments:
    box1 -- first box, list object with coordinates (x1, y1, x2, y2)
    box2 -- second box, list object with coordinates (x1, y1, x2, y2)
    """
    # Calculate the (y1, x1, y2, x2) coordinates of the intersection of box1 and box2. Calculate its Area.
    ### START CODE HERE ### (≈ 5 lines)
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    inter_area = (yi2 - yi1) * (xi2 - xi1)  # note: assumes the boxes overlap, as in the assignment's test case
    ### END CODE HERE ###

    # Calculate the Union area by using Formula: Union(A,B) = A + B - Inter(A,B)
    ### START CODE HERE ### (≈ 3 lines)
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    ### END CODE HERE ###

    # compute the IoU
    ### START CODE HERE ### (≈ 1 line)
    iou = inter_area / union_area
    ### END CODE HERE ###

    return iou


def yolo_non_max_suppression(scores, boxes, classes, max_boxes=10, iou_threshold=0.5):
    """
    Applies Non-max suppression (NMS) to set of boxes

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (, None), predicted score for each box
    boxes -- tensor of shape (4, None), predicted box coordinates
    classes -- tensor of shape (, None), predicted class for each box

    Note: The "None" dimension of the output tensors has obviously to be less than max_boxes. Note also that this
    function will transpose the shapes of scores, boxes, classes. This is made for convenience.
    """
    max_boxes_tensor = K.variable(max_boxes, dtype='int32')  # tensor to be used in tf.image.non_max_suppression()
    K.get_session().run(tf.variables_initializer([max_boxes_tensor]))  # initialize variable max_boxes_tensor

    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    ### START CODE HERE ### (≈ 1 line)
    nms_indices = tf.image.non_max_suppression(boxes=boxes, scores=scores, max_output_size=max_boxes,
                                               iou_threshold=iou_threshold)
    ### END CODE HERE ###

    # Use K.gather() to select only nms_indices from scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    scores = K.gather(scores, nms_indices)
    boxes = K.gather(boxes, nms_indices)
    classes = K.gather(classes, nms_indices)
    ### END CODE HERE ###

    return scores, boxes, classes


def yolo_eval(yolo_outputs, image_shape=(720., 1280.), max_boxes=10, score_threshold=.6, iou_threshold=.5):
    """
    Converts the output of YOLO encoding (a lot of boxes) to your predicted boxes along with their scores,
    box coordinates and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors:
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the input shape, in this notebook we use (608., 608.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None, ), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """
    ### START CODE HERE ###
    # Retrieve outputs of the YOLO model (≈1 line)
    box_confidence, box_xy, box_wh, box_class_probs = yolo_outputs

    # Convert boxes to be ready for filtering functions
    boxes = yolo_boxes_to_corners(box_xy, box_wh)

    # Use one of the functions you've implemented to perform Score-filtering with a threshold of score_threshold (≈1 line)
    scores, boxes, classes = yolo_filter_boxes(box_confidence=box_confidence, boxes=boxes,
                                               box_class_probs=box_class_probs, threshold=score_threshold)

    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)

    # Use one of the functions you've implemented to perform Non-max suppression with a threshold of iou_threshold (≈1 line)
    scores, boxes, classes = yolo_non_max_suppression(scores=scores, boxes=boxes, classes=classes,
                                                      iou_threshold=iou_threshold)
    ### END CODE HERE ###

    return scores, boxes, classes


def predict(sess, image_file):
    """
    Runs the graph stored in "sess" to predict boxes for "image_file". Prints and plots the predictions.

    Arguments:
    sess -- your tensorflow/Keras session containing the YOLO graph
    image_file -- name of an image stored in the "images" folder.

    Returns:
    out_scores -- tensor of shape (None, ), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None, ), class index of the predicted boxes

    Note: "None" actually represents the number of predicted boxes, it varies between 0 and max_boxes.
    """
    # Preprocess your image
    image, image_data = preprocess_image("images/" + image_file, model_image_size=(608, 608))

    # Run the session with the correct tensors and choose the correct placeholders in the feed_dict.
    # You'll need to use feed_dict={yolo_model.input: ... , K.learning_phase(): 0})
    ### START CODE HERE ### (≈ 1 line)
    out_scores, out_boxes, out_classes = sess.run([scores, boxes, classes],
                                                  feed_dict={yolo_model.input: image_data, K.learning_phase(): 0})
    ### END CODE HERE ###

    # Print predictions info
    print('Found {} boxes for {}'.format(len(out_boxes), image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    output_image = scipy.misc.imread(os.path.join("out", image_file))
    imshow(output_image)

    return out_scores, out_boxes, out_classes


if __name__ == '__main__':
    sess = K.get_session()
    class_names = read_classes("model_data/coco_classes.txt")
    anchors = read_anchors("model_data/yolo_anchors.txt")
    image_shape = (720.0, 1280.0)

    yolo_model = load_model("model_data/yolo.h5")
    yolo_model.summary()

    yolo_outputs = yolo_head(yolo_model.output, anchors, len(class_names))
    scores, boxes, classes = yolo_eval(yolo_outputs, image_shape)

    print(yolo_outputs)
    print("scores:", scores)
    print("boxes:", boxes)
    print("classes:", classes)

    plot_model(yolo_model, to_file='yolo.png', show_shapes=True)

    '''
    pics = []
    for root, dirs, files in os.walk("images/"):
        for file in files:
            if os.path.splitext(file)[1] == '.jpg':  # os.path.splitext() splits a path into (file name, extension)
                pics.append(file)
    for pic in pics:
        out_scores, out_boxes, out_classes = predict(sess, pic)
    '''

    print("END!")
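Once the __main__ setup above has run, detecting objects in one image is a single call to predict. A minimal sketch, assuming images/test.jpg exists and an out/ folder has been created for the annotated output (the file name is just a placeholder):

# run inside the same session after the __main__ block has built scores, boxes, classes
out_scores, out_boxes, out_classes = predict(sess, "test.jpg")
print(out_scores.shape, out_boxes.shape, out_classes.shape)  # numpy arrays, one row per detected box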

Results

As the results show, the model's predictions on these images are not particularly accurate; quite a few cars are missed.

[Result images: detections drawn on three test images]

Notes and pitfalls

yolo.h5

The main program loads the YOLO model with
yolo_model = load_model("model_data/yolo.h5")
This file is not included with the assignment files, so you have to generate it yourself or download it; a download link is provided here: yolo.h5
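If you are not sure whether the yolo.h5 you downloaded or generated is intact, a quick sanity check is simply to load it and print its summary (the path is assumed to be model_data/yolo.h5, as in the script above):

from keras.models import load_model

yolo_model = load_model("model_data/yolo.h5")  # raises an error if the file is corrupt or incomplete
yolo_model.summary()                           # for this assignment the input shape should be (None, 608, 608, 3)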

GitHub file size limit

yolo.h5 is over 100 MB, which exceeds the maximum file size GitHub allows you to push, so make sure this file is excluded before you git push; otherwise it is hard to untangle afterwards.

If you have already pushed a large file, here are two ways to deal with it:
1. Follow a guide on working around GitHub's 100 MB upload limit; I tried this and it did not solve the problem.
2. Delete the whole local repository, clone it again from the remote, add the new files (leaving the large file out this time), then commit and push again.