A CNN Model for Text Classification (TensorFlow Implementation)

阿新 • Published: 2018-12-02

Preface

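I have recently been studying deep learning models for text classification, and read the following three papers that use convolutional neural networks (CNNs) for the task:
(1) "Convolutional Neural Networks for Sentence Classification"
(2) "Character-level Convolutional Networks for Text Classification"
(3) "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks"
The following blog post also walks through the ideas behind several text classification papers: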

https://blog.csdn.net/guoyuhaoaaa/article/details/53188918

Model Implementation

Over the past few days I implemented the CNN model from the first paper on the 20 Newsgroups dataset, in the following three variants:

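CNN-rand
The word vectors in a sentence are randomly initialized and treated as parameters to be optimized during CNN training.
CNN-static
The word vectors in a sentence are taken from a table of vectors pre-trained with word2vec on the Google News dataset (about 100 billion words). They are a fixed input during CNN training and are not optimized.

CNN-nonstatic
The word vectors in a sentence are taken from the same word2vec table pre-trained on the Google News dataset (about 100 billion words). Here they only serve as the initial values: they are treated as parameters and fine-tuned during CNN training.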
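
The difference between the three variants comes down to two choices: where the initial vectors come from, and whether they are updated during training. The following is a minimal pure-Python sketch of that distinction, not the code from the repository. The function name `build_embedding_table`, the `pretrained` dictionary (a stand-in for a word2vec lookup), and the random fallback for words missing from the pre-trained table are all assumptions made for illustration.

```python
import random


def build_embedding_table(vocab, pretrained, dim, variant, seed=0):
    """Return (table, trainable) for "rand", "static" or "nonstatic".

    vocab:      list of words
    pretrained: dict word -> list of floats (hypothetical word2vec stand-in)
    """
    if variant not in ("rand", "static", "nonstatic"):
        raise ValueError("unknown variant: %r" % variant)
    rng = random.Random(seed)
    table = []
    for word in vocab:
        if variant != "rand" and word in pretrained:
            # static / nonstatic: start from the pre-trained vector
            table.append(list(pretrained[word]))
        else:
            # rand, or a word that is missing from the pre-trained table
            table.append([rng.uniform(-0.25, 0.25) for _ in range(dim)])
    # Only CNN-static keeps the embeddings frozen during training
    trainable = variant != "static"
    return table, trainable


vocab = ["good", "movie", "zzz"]
pretrained = {"good": [0.1, 0.2], "movie": [0.3, 0.4]}
for variant in ("rand", "static", "nonstatic"):
    table, trainable = build_embedding_table(vocab, pretrained, 2, variant)
    print(variant, "trainable:", trainable)
```

In TensorFlow the returned flag would typically map onto the `trainable` argument of `tf.Variable`.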

The overall architecture is shown in the figure (taken from paper 1):
[Figure: model architecture]
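It consists of the following parts:
* Input layer
* Convolution layer
Extracts the feature maps, i.e. the text features we need. In the code below, the ReLU activation is applied to the convolution output.
* Pooling layer
Applies max-pooling: the largest value in each feature map is extracted, and these values are combined into a one-dimensional vector.
* Output layer
Takes the one-dimensional vector produced by pooling as input and feeds it through a fully connected layer to produce the class scores. A Dropout layer is added to prevent overfitting, and L2 regularization is applied to the parameters of the fully connected layer.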
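
To make the convolution and pooling steps concrete, here is a small pure-Python sketch of a single filter sliding over a sentence matrix, followed by ReLU and max-pooling. It is only an illustration of the idea under toy assumptions: the function name `conv_relu_maxpool` and all the numbers are made up, and a real model uses many filters of several sizes.

```python
def conv_relu_maxpool(sentence, filt, bias):
    """Apply one filter to a sentence matrix, then ReLU and max-pooling.

    sentence: list of word vectors (sequence_length x embedding_size)
    filt:     filter weights (filter_size x embedding_size)
    """
    filter_size = len(filt)
    feature_map = []
    # "VALID" convolution: sequence_length - filter_size + 1 positions
    for start in range(len(sentence) - filter_size + 1):
        total = bias
        for i in range(filter_size):
            for j in range(len(filt[i])):
                total += sentence[start + i][j] * filt[i][j]
        feature_map.append(max(0.0, total))  # ReLU
    # Max-pooling keeps only the largest value of the feature map
    return feature_map, max(feature_map)


sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # 4 words, 2-dim vectors
filt = [[1.0, 0.0], [0.0, 1.0]]                              # filter_size = 2
feature_map, pooled = conv_relu_maxpool(sentence, filt, bias=-0.5)
print(feature_map, pooled)  # [1.5, 0.5, 0.5] 1.5
```

Each filter therefore contributes exactly one number to the pooled vector, regardless of sentence length.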

For a more detailed explanation, see this article: https://www.jianshu.com/p/fe428f0b32c1

Code Excerpts

The full code, along with the pitfalls I ran into, is on my GitHub:
https://github.com/DilicelSten/CNN_learning/blob/master/simple%20cnn/

# This excerpt is written against the TensorFlow 1.x API (tf.placeholder, tf.contrib)
import tensorflow as tf


class TextCNN(object):
    # The class wrapper, the constructor signature and the two placeholders
    # are assumed context; the original excerpt starts at the embedding layer.
    def __init__(self, sequence_length, num_classes, vocab_size,
                 embedding_size, filter_sizes, num_filters):
        # Word-id input and dropout keep probability (assumed placeholders)
        self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

        # Running sum of the L2 penalty on the output layer
        l2_loss = tf.constant(0.0)

        # Embedding layer: randomly initialized and trained with the model (CNN-rand)
        with tf.device('/cpu:0'), tf.name_scope("embedding"):
            self.W = tf.Variable(
                tf.random_uniform([vocab_size, embedding_size], -0.25, 0.25),
                name="W")
            self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)
            # Add a channel dimension, since conv2d expects a 4-D input
            self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

        # Create a convolution + max-pool layer for each filter size
        pooled_outputs = []
        for i, filter_size in enumerate(filter_sizes):
            with tf.name_scope("conv-maxpool-%s" % filter_size):
                # Convolution layer: each filter spans the full embedding width
                filter_shape = [filter_size, embedding_size, 1, num_filters]
                W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
                b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
                conv = tf.nn.conv2d(
                    self.embedded_chars_expanded,
                    W,
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="conv")

                # Apply nonlinearity
                h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")

                # Max-pooling over the whole feature map
                pooled = tf.nn.max_pool(
                    h,
                    ksize=[1, sequence_length - filter_size + 1, 1, 1],
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="pool")
                pooled_outputs.append(pooled)

        # Combine all the pooled features
        num_filters_total = num_filters * len(filter_sizes)
        self.h_pool = tf.concat(pooled_outputs, 3)
        self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

        # Add dropout
        with tf.name_scope("dropout"):
            self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

        # Final (unnormalized) scores and predictions
        with tf.name_scope("output"):
            W = tf.get_variable(
                "W",
                shape=[num_filters_total, num_classes],
                initializer=tf.contrib.layers.xavier_initializer())
            b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
            # L2 penalty on the output layer, to be added to the loss (not shown)
            l2_loss += tf.nn.l2_loss(W)
            l2_loss += tf.nn.l2_loss(b)
            self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
            self.predictions = tf.argmax(self.scores, 1, name="predictions")
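
The tensor shapes in the excerpt above are easy to lose track of, so here is a small pure-Python sketch that traces them from the embedding output to the final scores. It does not use TensorFlow; the helper name `trace_shapes` and the example hyperparameters are assumptions chosen for illustration, not values taken from the repository.

```python
def trace_shapes(batch, sequence_length, embedding_size,
                 filter_sizes, num_filters, num_classes):
    """List (name, shape) pairs for the main tensors of the excerpt."""
    shapes = [("embedded_chars_expanded",
               (batch, sequence_length, embedding_size, 1))]
    for fs in filter_sizes:
        # VALID convolution with a filter as wide as the embedding
        conv_len = sequence_length - fs + 1
        shapes.append(("conv-%d" % fs, (batch, conv_len, 1, num_filters)))
        # The pooling window covers the whole feature map
        shapes.append(("pool-%d" % fs, (batch, 1, 1, num_filters)))
    total = num_filters * len(filter_sizes)
    shapes.append(("h_pool", (batch, 1, 1, total)))   # concat on axis 3
    shapes.append(("h_pool_flat", (batch, total)))    # reshape
    shapes.append(("scores", (batch, num_classes)))   # xw_plus_b
    return shapes


# Hypothetical hyperparameters, for illustration only
for name, shape in trace_shapes(64, 56, 128, [3, 4, 5], 128, 20):
    print(name, shape)
```

Because every filter size is pooled down to a single value per filter, the flattened feature vector has length `num_filters * len(filter_sizes)` no matter how long the input sequence is.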
