
Comprehensive Exercise: English Word Frequency Statistics


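  1. Preprocessing for word frequency statistics
  2. Download the lyrics of an English song or an English article
  3. Replace all separators such as ,.?!’: with spaces
  4. Convert all uppercase letters to lowercase
  5. Generate the word list
  6. Generate the word frequency counts
  7. Sort
  8. Exclude grammatical words: pronouns, articles, conjunctions
  9. Output the top 10 most frequent words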
song = '''
If you say you're the firework at the bay
I wish I could be a wave
after the rain, you light up the gray
far away you're the galaxy from space
with the stars you kiss my face
I'll go everywhere after your trace
when I'm lonely I will learn to embrace
I'll follow you along the way
like shadow chasing down the flame
I'll wait for you right on your way
come and stay with me if you may
I'll raise my head and look your way
tears dropping down and feeling free
Some love comes by like hurricane
as if I play your losing game
If you're like firefly in summer haze
Children laugh around your grace
Then I'll be there, trying to say out your name
Look at me, what a tiny helpless me
Only dream when you smile at me
Maybe you wouldn't stop just for me
Far behind let me stand there singing
I'll follow you along the way
like shadow chasing down the flame
I'll wait for you right on your way
come and stay with me if you may
I'll raise my head and look your way
tears dropping down and feeling free
Some love comes by like hurricane
but rainbows rise
I'll follow you along the way
like shadow chasing down the flame
I'll wait for you right on your way
come and stay with me if you may
I'll raise my head and look your way
tears dropping down and feeling free
Some love comes by like hurricane
but rainbows rise after the pain
'''

# Convert everything to lowercase, replace the separators with spaces,
# and split on whitespace to get the word list.
# Apostrophes are kept so that contractions such as "i'll" stay whole
# and match the stop-word list below.
s1 = song.lower()
for sep in ',.?!:':
    s1 = s1.replace(sep, ' ')
s1 = s1.split()

# Count how many times each word occurs
c = {}
for i in s1:
    c[i] = c.get(i, 0) + 1

# Remove the words that carry no meaning
word = '''
i you you're the by up a but my and would when some i'll i'm with on
could come from maybe only out me in at for if your down
'''
s3 = word.split()
for i in s3:
    if i in c:
        del c[i]

# Sort by the number of occurrences of each word, highest first
count = sorted(c.items(), key=lambda items: items[1], reverse=True)

# Output the top 10 most frequent words
for i in range(10):
    print(count[i])
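
As a possible alternative to the hand-built dictionary above, the same steps can be sketched with `collections.Counter` and `re` from the standard library. The function name `top_words`, the sample sentence, and the shortened stop-word set are assumptions made for illustration; they are not part of the original exercise.

```python
import re
from collections import Counter

# Hypothetical, shortened stop-word set (the exercise's own list is longer).
STOP_WORDS = {"i", "you", "the", "a", "and", "my", "your", "i'll", "you're"}


def top_words(text, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, ignoring stop words."""
    # Lowercase and normalise curly apostrophes to straight ones.
    normalised = text.lower().replace("’", "'")
    # Keep runs of letters and apostrophes; everything else acts as a separator.
    tokens = re.findall(r"[a-z']+", normalised)
    counts = Counter(t for t in tokens if t not in stop_words)
    # most_common returns (word, count) pairs sorted by count, highest first.
    return counts.most_common(n)


sample = "I'll follow you along the way. Follow the way, follow!"
print(top_words(sample, n=2))  # [('follow', 3), ('way', 2)]
```

Calling `top_words(song)` on the lyrics would give a top-10 list in one step. Note that the regular expression treats every non-letter character as a separator, which is slightly broader than the separator list in the exercise.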
