
Final Comprehensive Assignment: Word Frequency Count


# 1. Read the whole text file into a string.
bigFile = open('big.txt', mode='r', encoding='utf-8')
bigText = bigFile.read()
bigFile.close()
print(bigText)

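# 2. Replace punctuation and newlines with spaces.
replaceList = [',', '.', "'", '\n']
for c in replaceList:
    bigText = bigText.replace(c, ' ')
print(bigText)
# Collapse doubled spaces left behind by the replacements.
bigText = bigText.replace('  ', ' ')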

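# 3. Split the text into words. split() with no argument skips runs of whitespace.
print(bigText.split())
bigList = bigText.split()

# 4. Count how often each distinct word occurs.
bigSet = set(bigList)
print(bigList)
bigDict = {}
for word in bigSet:
    bigDict[word] = bigList.count(word)
print(bigDict)
for d in bigDict:
    print(d, bigDict[d])

# 5. Sort the (word, count) pairs by count, highest first.
wordCountList = list(bigDict.items())
print(wordCountList)
wordCountList.sort(key=lambda x: x[1], reverse=True)
print(wordCountList)

# 6. Print the 20 most frequent words (fewer if the text has fewer distinct words).
for item in wordCountList[:20]:
    print(item)

# 7. Append "count word" lines to the output file.
bigCountFile = open('bigCount.txt', mode='a', encoding='utf-8')
for i in range(len(wordCountList)):
    bigCountFile.write(str(wordCountList[i][1]) + ' ' + wordCountList[i][0] + '\n')
bigCountFile.close()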
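
The counting and sorting in steps 4 to 6 can also be done with `collections.Counter` from the Python standard library. The following is only a minimal sketch of that alternative, not the approach the assignment itself uses. The helper name `count_words`, its `top_n` parameter, and the sample sentence are assumptions made for illustration.

```python
from collections import Counter


def count_words(text, top_n=20):
    """Return up to top_n (word, count) pairs, most frequent first.

    count_words is a hypothetical helper, not part of the original script.
    """
    # Same cleanup as step 2: replace punctuation and newlines with spaces.
    for c in [',', '.', "'", '\n']:
        text = text.replace(c, ' ')
    # split() with no argument ignores runs of whitespace.
    words = text.split()
    # Counter tallies all words in one pass; most_common sorts by count.
    return Counter(words).most_common(top_n)


if __name__ == '__main__':
    # Made-up sample text, used only to demonstrate the call.
    sample = "the cat sat. the dog sat, the end"
    for word, count in count_words(sample, top_n=3):
        print(count, word)
```

`Counter` scans the word list once, whereas calling `bigList.count(word)` for every distinct word rescans the whole list each time, which becomes slow on a large file.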
