nltp APP - Analyzing buyer review ratings - High-frequency words: a two-dimensional relationship

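# -*- coding: utf-8 -*-
import os

from nltk import FreqDist

# TO FIX : No such file or directory
os.chdir(r'E:\zpy')

# Read the whole file of low-rated review text
with open('reviews_text_lt_3.txt', 'r') as f:
    f_r = f.read()
# Split on single spaces (punctuation and newlines stay attached to tokens)
strList = f_r.split(' ')
fdist1 = FreqDist(strList)
# Total number of words
print(fdist1)
# keys() gives us all the distinct types in the text
vocabulary1 = list(fdist1.keys())
# Slice to look at the first 50 items of this list
res0_50 = vocabulary1[:50]
print(res0_50)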

C:\>python E:\zpy\wltp.py
<FreqDist with 16789 samples and 180043 outcomes>
['', 'raining', 'disappointing.It', 'uncomfortable...', "lot's", 'uv.\nSo,', 'yellow', 'Seller', 'four', 'vaporizers.I', 'Does', 'completely!!', 'hanging', 'Monday,', 'asap!!This', 'Until', 'instead.The', 'malfunctioned.', 'Lately', 'looking', 'LAST', 'eligible', 'electricity', 'DISAPPOINTED', 'oneWorks', 'powdery', 'unanswered', 'also.', 'refunsooooo', 'foul', 'on\nafter', 'fingers.', 'advice:', 'fingers,', 'advice?', 'each),', 'month.I']
C:\>
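
The keys printed by the script do not appear to be ordered by frequency, so on their own they do not show which words are the high-frequency ones. Below is a minimal sketch of ranking tokens by count. It assumes NLTK 3, where `FreqDist` subclasses `collections.Counter` and therefore inherits `most_common()`; the standard-library `Counter` is used here so the snippet runs without NLTK. The helper name `top_tokens`, the punctuation stripping, the lowercasing, and the sample text are all assumptions for illustration, not part of the original script.

```python
from collections import Counter

def top_tokens(text, n=5):
    """Return the n most frequent tokens after light normalization."""
    tokens = []
    for raw in text.split():  # split on any whitespace, so no '' tokens
        # Strip surrounding punctuation so 'Monday,' and 'Monday' match
        token = raw.strip('.,!?:;()"\'').lower()
        if token:
            tokens.append(token)
    return Counter(tokens).most_common(n)

# Made-up sample text, for illustration only
sample = "Yellow case broke. The yellow one, again! Arrived Monday, broke Monday."
print(top_tokens(sample, 3))
```

With the real data, `fdist1.most_common(50)` should give the 50 most frequent tokens directly, although tokens fused across sentences (such as `disappointing.It`) would still need a proper tokenizer.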

SELECT
    amz_review_text
FROM amz_reviews_grab_us WHERE amz_review_rating < 3 LIMIT 3000;
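
This query presumably produces the text that the script reads from reviews_text_lt_3.txt, although the post does not show that step. Below is a minimal sketch of such an export. It uses an in-memory SQLite database as a stand-in; the real database engine, the connection details, the helper name `fetch_low_rated_text`, and the sample rows are all assumptions.

```python
import sqlite3

QUERY = """
SELECT amz_review_text
FROM amz_reviews_grab_us
WHERE amz_review_rating < 3
LIMIT 3000;
"""

def fetch_low_rated_text(conn):
    """Run the query and join the review texts into one space-separated string."""
    rows = conn.execute(QUERY).fetchall()
    return " ".join(row[0] for row in rows)

# Stand-in database with made-up rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE amz_reviews_grab_us (amz_review_text TEXT, amz_review_rating INTEGER)"
)
conn.executemany(
    "INSERT INTO amz_reviews_grab_us VALUES (?, ?)",
    [("Broke on Monday", 1), ("Works great", 5), ("Yellow one faded", 2)],
)

text = fetch_low_rated_text(conn)
print(text)
```

Writing `text` to a file with `open('reviews_text_lt_3.txt', 'w')` would then yield input in the shape the script expects.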

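For buyers on Amazon's US site, within the time period (y-m-d) covered by the first 3000 rows returned from the database, and without considering factors such as category, price, or relative rating value,

the following tentative conjectures can be drawn:
0 - Items with the attribute yellow may, all else being equal, be unpopular and receive relatively low ratings;
1 - Monday may give buyers a poor purchasing experience; Monday promotions should be planned with caution and weighed against other factors such as local customs and news events.
Note: this is the dev's current perspective.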
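
Both conjectures rest on tokens seen in low-rated reviews only, so a word such as yellow or Monday could be just as common in high-rated reviews. One way to check this, sketched below on the assumption that review text and rating are available together, is to compare the share of reviews containing a word in each rating group. The function name `share_by_rating` and the sample reviews are hypothetical.

```python
from collections import defaultdict

def share_by_rating(reviews, word):
    """Map each rating to the fraction of its reviews that contain `word`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    word = word.lower()
    for text, rating in reviews:
        totals[rating] += 1
        # Normalize tokens the same way for every review
        tokens = {t.strip('.,!?:;()"\'').lower() for t in text.split()}
        if word in tokens:
            hits[rating] += 1
    return {rating: hits[rating] / float(totals[rating]) for rating in totals}

# Made-up reviews as (text, rating) pairs, for illustration only.
reviews = [
    ("The yellow one broke.", 1),
    ("Arrived Monday, already cracked", 1),
    ("Love the yellow color!", 5),
    ("Works fine", 5),
]
shares = share_by_rating(reviews, "yellow")
print(shares)
```

A word supports a conjecture only if its share is clearly higher in the low-rating groups than in the high-rating ones. In this toy data yellow appears equally often in both, which says nothing about the real reviews.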
