
Using the Baidu Speech Recognition API: A Python Example

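Baidu's sample programs, in both the C and Java versions, come in two variants, method1 and method2.

The former is called implicit (the POST body is a JSON string with the audio data encoded inside it); the latter is called explicit (the POST body is the audio data itself).

Python's wave package hands audio back as "strings", and at first I worried that these would not match the arrays used in the C version, so I chose the less efficient but safer method1:

base64-encode the audio data, collect it into a dict together with the sample rate, channel count, and so on, and finally encode the whole thing as a JSON string.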
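As a rough sketch of the method1 body described above (Python 3 syntax, standard library only): the field names `format`, `rate`, `channel`, `cuid`, `token`, `len` and `speech` are recalled from Baidu's REST API documentation and are assumptions to check against the current docs, and `build_method1_body` is a hypothetical helper name. Note that `len` is taken here to be the byte length of the raw audio before base64 encoding.

```python
import base64
import json


def build_method1_body(pcm_bytes, rate, token, cuid):
    """Build the JSON body for the implicit (method1) request.

    The field names are assumptions; verify them against Baidu's docs.
    """
    body = {
        "format": "pcm",
        "rate": rate,
        "channel": 1,
        "cuid": cuid,
        "token": token,
        # length of the raw audio in bytes, not of the base64 text
        "len": len(pcm_bytes),
        # b64encode returns bytes; decode so json.dumps gets a text string
        "speech": base64.b64encode(pcm_bytes).decode("ascii"),
    }
    return json.dumps(body)
```

The result would be posted with a JSON content type; a mismatch between `len` and the actual audio size is one thing worth ruling out when the server rejects the parameters.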

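The server kept replying with:

3300 invalid input parameters

I tried both the urllib2 and pycurl packages and got the same error each time.

With no other option I switched to method2, and it worked (apparently what the wave package holds is not text "strings" at all, just raw bytes that can be posted as is).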
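For method2 the request body is the raw PCM payload of the WAV file, and the sample rate travels in the Content-Type header. Before the full listing, here is a small sketch (Python 3 syntax) of reading both from the file with the standard `wave` module instead of assuming them. `read_pcm` is a hypothetical helper, and the in-memory WAV exists only to keep the example self-contained.

```python
import contextlib
import io
import wave


def read_pcm(source):
    """Return (pcm_bytes, rate, channels, sample_width) for a WAV path or file object."""
    with contextlib.closing(wave.open(source, "rb")) as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        pcm = wav.readframes(wav.getnframes())
    return pcm, rate, channels, width


# Build a tiny 8 kHz, 16-bit mono WAV in memory to exercise the helper.
buf = io.BytesIO()
with contextlib.closing(wave.open(buf, "wb")) as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b"\x00\x00" * 100)
buf.seek(0)

pcm, rate, channels, width = read_pcm(buf)
headers = [
    "Content-Type: audio/pcm; rate=%d" % rate,
    "Content-Length: %d" % len(pcm),
]
```

Taking the length from `len(pcm)` keeps the header correct even if the file turns out not to be 16-bit mono.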

#encoding=utf-8

import wave
import json
import urllib2
import pycurl
## get access token by api key & secret key

def get_token():
    apiKey = "xxxxxxxx"        # placeholder: your API key
    secretKey = "xxxxxxxxx"    # placeholder: your secret key

    auth_url = ("https://openapi.baidu.com/oauth/2.0/token"
                "?grant_type=client_credentials"
                "&client_id=" + apiKey + "&client_secret=" + secretKey)

    # the token endpoint replies with JSON; only access_token is needed
    res = urllib2.urlopen(auth_url)
    json_data = res.read()
    return json.loads(json_data)['access_token']

def dump_res(buf):
    print buf


## post audio to server
def use_cloud(token):
    fp = wave.open('vad_0.wav', 'rb')
    nf = fp.getnframes()
    audio_data = fp.readframes(nf)
    fp.close()
    f_len = len(audio_data)  # body size in bytes (nf * 2 only holds for 16-bit mono)

    cuid = "xxxxxxxxxx" #my xiaomi phone MAC
    srv_url = 'http://vop.baidu.com/server_api' + '?cuid=' + cuid + '&token=' + token
    http_header = [
        'Content-Type: audio/pcm; rate=8000',  # must match the file's sample rate
        'Content-Length: %d' % f_len
    ]

    c = pycurl.Curl()
    c.setopt(pycurl.URL, str(srv_url)) #curl doesn't support unicode
    c.setopt(c.HTTPHEADER, http_header)   #must be list, not dict
    c.setopt(c.POST, 1)
    c.setopt(c.CONNECTTIMEOUT, 30)
    c.setopt(c.TIMEOUT, 30)
    c.setopt(c.WRITEFUNCTION, dump_res)  # response body is handed to dump_res
    c.setopt(c.POSTFIELDS, audio_data)
    c.setopt(c.POSTFIELDSIZE, f_len)
    c.perform() #pycurl.perform() has no return val
    c.close()

if __name__ == "__main__":
    token = get_token()
    use_cloud(token)

Output:
{"corpus_no":"6150045491002357923","err_msg":"success.","err_no":0,"result":["播放小蘋果,"],"sn":"243903724071431919050"}
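Since `dump_res` only prints the raw response, here is a sketch (Python 3 syntax) of pulling the recognized text out of it. It relies only on the fields visible in the output above (`err_no`, `err_msg`, `result`) and assumes the whole response has already been collected into one string; `parse_result` is a hypothetical helper.

```python
import json


def parse_result(response_text):
    """Return the list of recognized strings, or raise if err_no is non-zero."""
    reply = json.loads(response_text)
    if reply.get("err_no") != 0:
        raise RuntimeError(
            "recognition failed: %s %s" % (reply.get("err_no"), reply.get("err_msg"))
        )
    return reply.get("result", [])


sample = (
    '{"corpus_no":"6150045491002357923","err_msg":"success.","err_no":0,'
    '"result":["播放小蘋果,"],"sn":"243903724071431919050"}'
)
texts = parse_result(sample)
```

A reply such as the 3300 error mentioned earlier would surface as a `RuntimeError` instead of being printed and ignored.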