Task: text detection (including tilted text)

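Contributions
- Proposes an end-to-end fully convolutional network for text detection
- Can produce geometry annotations in either of two formats, quadrangles or rotated boxes, depending on the application
- Improves on state-of-the-art methods

Core idea: the main idea comes from U-Net. A U-shaped architecture produces (1) pixel-level segmentation predictions and (2) pixel-level geometry predictions. From (1) and (2), the coordinates of the four vertices of each bounding box can be computed. NMS then removes redundant, duplicate bounding boxes.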
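The decoding step described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the function names (`restore_rbox`, `enclosing_rect`, `iou`, `nms`), the angle convention (rotation about the pixel in image coordinates), and the 0.5 IoU threshold are all assumptions, and the suppression step is plain greedy NMS on axis-aligned enclosing rectangles, used here as a simplification.

```python
import math


def restore_rbox(x, y, d_top, d_right, d_bottom, d_left, theta):
    """Corners (TL, TR, BR, BL) of the rotated box predicted at pixel (x, y).

    Assumed convention: the four distances are measured in the box's own
    frame, which is rotated by theta around the pixel (image coordinates).
    """
    local = [(-d_left, -d_top), (d_right, -d_top),
             (d_right, d_bottom), (-d_left, d_bottom)]
    c, s = math.cos(theta), math.sin(theta)
    # Rotate each corner offset by theta, then translate to the pixel.
    return [(x + u * c - v * s, y + u * s + v * c) for u, v in local]


def enclosing_rect(quad):
    """Axis-aligned rectangle (x0, y0, x1, y1) that encloses a quadrangle."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def iou(a, b):
    """Intersection over union of two axis-aligned rectangles."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(quads, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box only if it does not overlap a better one."""
    rects = [enclosing_rect(q) for q in quads]
    order = sorted(range(len(quads)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(rects[i], rects[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two pixels of the same word predict the same box; a third pixel
# belongs to a different word.
quads = [
    restore_rbox(10, 10, 5, 20, 5, 8, 0.0),
    restore_rbox(12, 10, 5, 18, 5, 10, 0.0),
    restore_rbox(100, 50, 4, 10, 4, 10, 0.0),
]
scores = [0.9, 0.8, 0.95]
kept = nms(quads, scores)  # indices of the surviving boxes, best score first
```

In this example `kept` holds the box of the second word and the higher-scoring of the two duplicate boxes; the lower-scoring duplicate is suppressed.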

Algorithm pipeline:
- Training phase
- Test phase
- The ResNet-based U-Net architecture is shown in the figure below
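Algorithm details:
- How is the ground truth computed?
  - Ground truth for the score map: the original bounding box is shrunk inward by 0.3r, where r is the length of its short side. I don't fully understand why this step is needed; is it to remove noise?
  - Ground truth for the geometry map: taking RBOX-type data as the example, as shown in the figure below. For every point inside the bounding box, we compute its distances to the four sides (top, bottom, left, right) and the angle. For points outside the bounding box, the ground truth is set to 0.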
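- Loss function?
  - For the score map: we use balanced cross-entropy, which offsets the imbalance between positive and negative samples. Its definition is given first, followed by the implementation:
    
    $L_{s} = \text{balanced-xent}(\hat{Y}, Y^{*}) = -\beta Y^{*} \log \hat{Y} - (1 - \beta)(1 - Y^{*}) \log(1 - \hat{Y})$

    where $\hat{Y}$ is the predicted score map, $Y^{*}$ is the ground truth, and $\beta = 1 - \frac{\sum_{y^{*} \in Y^{*}} y^{*}}{|Y^{*}|}$ is the fraction of negative pixels.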

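        import tensorflow as tf  # TensorFlow 1.x API (tf.log, tf.app.flags)

        # Assumption: the flags batch_size_per_gpu and input_size are defined elsewhere.
        FLAGS = tf.app.flags.FLAGS


        def cross_entropy(y_true_cls, y_pred_cls, training_mask):
            '''
            Balanced cross-entropy loss on the score map.
            :param y_true_cls: ground-truth score map (tensor)
            :param y_pred_cls: predicted score map (tensor)
            :param training_mask: mask of the pixels that contribute to the loss (tensor)
            :return: scalar loss tensor
            '''
            # Earlier NumPy version, kept for reference:
            # eps = 1e-10
            # y_pred_cls = y_pred_cls * training_mask + eps
            # y_true_cls = y_true_cls * training_mask + eps
            # shape = list(np.shape(y_true_cls))
            # beta = 1 - (np.sum(np.reshape(y_true_cls, [shape[0], -1]), axis=1) / (1.0 * shape[1] * shape[2]))
            # cross_entropy_loss = -beta * y_true_cls * np.log(y_pred_cls) - (1 - beta) * (1 - y_true_cls) * np.log(
            #     1 - y_pred_cls)
            # return np.mean(cross_entropy_loss)
            eps = 1e-10
            # Zero out ignored pixels; with label 0 and prediction 0 they add practically no loss.
            y_pred_cls = y_pred_cls * training_mask
            y_true_cls = y_true_cls * training_mask
            each_y_true_sample = tf.split(y_true_cls, num_or_size_splits=FLAGS.batch_size_per_gpu, axis=0)
            each_y_pred_sample = tf.split(y_pred_cls, num_or_size_splits=FLAGS.batch_size_per_gpu, axis=0)
            loss = None
            for i in range(FLAGS.batch_size_per_gpu):
                cur_true = each_y_true_sample[i]
                cur_pred = each_y_pred_sample[i]
                # beta is the fraction of negative pixels in this sample.
                beta = 1 - (tf.reduce_sum(cur_true) / (FLAGS.input_size * FLAGS.input_size))
                # eps goes inside the logs so that neither log(0) nor the log of a
                # negative number can occur (the original added eps to the inputs).
                cur_loss = (-beta * cur_true * tf.log(cur_pred + eps)
                            - (1 - beta) * (1 - cur_true) * tf.log(1 - cur_pred + eps))
                loss = cur_loss if loss is None else loss + cur_loss
            return tf.reduce_mean(loss)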
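To make the ground-truth construction and the balanced loss concrete, here is a small pure-Python sketch for a single axis-aligned text box. It is an illustration under stated assumptions, not the code used by the paper or by the implementation above: the helper names (`shrink_box`, `make_ground_truth`, `balanced_xent`), the 10×20 map size, and the example box are hypothetical; the box is axis-aligned, so the angle is 0; and the geometry is filled only for positive-score pixels, measured to the original (unshrunk) box.

```python
import math


def shrink_box(x0, y0, x1, y1, ratio=0.3):
    """Move every side inward by ratio * r, r being the shorter side length."""
    margin = ratio * min(x1 - x0, y1 - y0)
    return x0 + margin, y0 + margin, x1 - margin, y1 - margin


def make_ground_truth(height, width, box):
    """Score map and RBOX geometry map for one axis-aligned box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    sx0, sy0, sx1, sy1 = shrink_box(x0, y0, x1, y1)
    score = [[0.0] * width for _ in range(height)]
    # Geometry per pixel: (top, right, bottom, left, angle); zeros outside the text.
    geo = [[(0.0, 0.0, 0.0, 0.0, 0.0)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if sx0 <= x <= sx1 and sy0 <= y <= sy1:
                score[y][x] = 1.0
                geo[y][x] = (float(y - y0), float(x1 - x),
                             float(y1 - y), float(x - x0), 0.0)
    return score, geo


def balanced_xent(y_true, y_pred, eps=1e-10):
    """Balanced cross-entropy averaged over all pixels of one sample."""
    true = [t for row in y_true for t in row]
    pred = [p for row in y_pred for p in row]
    beta = 1.0 - sum(true) / len(true)  # fraction of negative pixels
    total = 0.0
    for t, p in zip(true, pred):
        total += (-beta * t * math.log(p + eps)
                  - (1.0 - beta) * (1.0 - t) * math.log(1.0 - p + eps))
    return total / len(true), beta


score, geo = make_ground_truth(10, 20, box=(2, 2, 18, 7))
num_positive = int(sum(map(sum, score)))

# An uninformative prediction of 0.5 everywhere.
uniform = [[0.5] * 20 for _ in range(10)]
loss, beta = balanced_xent(score, uniform)
```

With this box, 26 of the 200 pixels are positive, so β = 0.87. Under the uniform prediction the positive pixels contribute 26·β·log 2 and the negative pixels 174·(1−β)·log 2, which are equal; that is the sense in which the weighting balances the two classes.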

[Paper Reading] EAST: An Efficient and Accurate Scene Text Detector
