How to Build an Online Read-Aloud Feature ---- iFlytek Speech Synthesis in Practice

-- It has been a long time since I wrote a technical blog post. On a whim, I'm picking it back up.

Background

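I use Xuexi Qiangguo (學習強國) every day, and the app's read-aloud voice is quite good. I found out that it is powered by iFlytek (科大訊飛), so I started coding. API documentation: https://www.xfyun.cn/doc/tts/online_tts/API.html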

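I won't go over registering as a developer or the interface requirements, since the documentation covers them clearly. The actual implementation, of course, is another matter.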

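I listened to the sample voices. The free voices and the premium (featured) voices differ a lot: the free ones are at the level you can just about tolerate, while the premium ones are close to a real person. Pricing has two parts: a fee for the API itself and a fee for the premium voices. As usual when hacking on something, I started with the free tier.

Details: https://www.xfyun.cn/services/online_tts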

Getting started

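I looked around and found no C# demo, which was awkward. There is documentation, but as everyone knows (think of the WeChat Official Account development docs), turning documentation into working code and a visible application takes real effort. After some searching I finally found an open-source project that had been released only recently, and happily got to work: https://github.com/zuiyuewentian/XunFeiNETSDK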

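The iFlytek interface is based on WebSocket, so let's start with a console demo. .NET ships its own WebSocket implementation, but this SDK uses WebSocketSharp, which I like. System.Net.WebSockets.WebSocket is built around asynchronous methods (I come back to it later), whereas WebSocketSharp.WebSocket is event-based, which matches front-end programming habits.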

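// reqUrl is the request URL built as described in the iFlytek documentation
websocket = new WebSocketSharp.WebSocket(reqUrl);
websocket.OnMessage += Websocket_OnMessage; // raised for each message from the server
websocket.OnOpen += Websocket_OnOpen;       // raised once the connection is established
websocket.Connect();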

After the iFlytek server receives our text, it sends the audio back as a stream. On our side we only need to assemble that stream into a file.

using System;
using System.Diagnostics;
using System.Linq;
using NAudio.Wave;
// Also import the namespace that declares XunFeiTTS and TTS_Data_Model in XunFeiNETSDK.

class Program
{
    private static Stopwatch stopwatch;
    private static byte[] data = new byte[0]; // audio bytes received so far

    public static void Main(string[] args)
    {
        stopwatch = Stopwatch.StartNew();
        var xunFeiNetSdk = new XunFeiTTS();
        xunFeiNetSdk.MessageUpdate_Event += XunFeiNetSdk_MessageUpdate_Event;
        // The argument is the text to synthesize
        xunFeiNetSdk.SendData("張家界荷花國際機場,北京大興機場,長沙黃花機場,邵陽武岡機場,所有航班全部復航!");
        Console.Read(); // keep the process alive while audio frames arrive
    }

    private static void XunFeiNetSdk_MessageUpdate_Event(TTS_Data_Model message, string error = null)
    {
        if (error != null)
        {
            Console.WriteLine(error);
            return;
        }

        try
        {
            // Every frame, including the last one, carries audio
            data = data.Concat(message.audioStream).ToArray();

            // status == 2 marks the end of synthesis
            if (message.status != 2) return;

            Console.WriteLine("Synthesis succeeded");
            string voice = string.Format("{0}.wav", DateTime.Now.ToString("yyyyMMddHHmmssfff"));
            Console.WriteLine("Saving..." + voice);

            // 16 kHz, mono; the using block closes and disposes the writer
            using (var writer = new WaveFileWriter(voice, new WaveFormat(16000, 1)))
            {
                writer.Write(data, 0, data.Length);
            }

            Console.WriteLine("Saved");
            Console.WriteLine("Elapsed: " + stopwatch.Elapsed);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}

The file is saved with NAudio. I pulled the relevant code out of XunFeiNETSDK into my own project.
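To make the "stream to file" step less opaque, here is a minimal sketch in JavaScript (the language of the page script later in this post) of what turning raw audio chunks into a WAV file amounts to: concatenate the chunks once and prepend a canonical 44-byte PCM header. It assumes the chunks are raw 16-bit PCM at 16 kHz mono, matching the `WaveFormat(16000, 1)` used above. The function name `pcmChunksToWav` is made up for illustration, and the header NAudio writes may differ in detail.

```javascript
// Concatenate raw PCM chunks (Uint8Array) and prepend a canonical 44-byte WAV header.
function pcmChunksToWav(chunks, sampleRate, channels, bitsPerSample) {
    var dataLength = chunks.reduce(function (n, c) { return n + c.length; }, 0);
    var blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
    var wav = new Uint8Array(44 + dataLength);
    var view = new DataView(wav.buffer);

    function writeAscii(offset, s) {
        for (var i = 0; i < s.length; i++) wav[offset + i] = s.charCodeAt(i);
    }

    writeAscii(0, "RIFF");
    view.setUint32(4, 36 + dataLength, true);          // file size minus 8, little-endian
    writeAscii(8, "WAVE");
    writeAscii(12, "fmt ");
    view.setUint32(16, 16, true);                      // size of the fmt chunk
    view.setUint16(20, 1, true);                       // format tag 1 = PCM
    view.setUint16(22, channels, true);
    view.setUint32(24, sampleRate, true);
    view.setUint32(28, sampleRate * blockAlign, true); // bytes per second
    view.setUint16(32, blockAlign, true);
    view.setUint16(34, bitsPerSample, true);
    writeAscii(36, "data");
    view.setUint32(40, dataLength, true);

    // Copy each chunk once, instead of re-copying the whole array on every frame
    var offset = 44;
    chunks.forEach(function (c) {
        wav.set(c, offset);
        offset += c.length;
    });
    return wav;
}
```

For example, `pcmChunksToWav(chunks, 16000, 1, 16)` returns a `Uint8Array` that can be written to disk or wrapped in a `Blob`. Collecting the chunks in a list and joining them once at the end also avoids the repeated full-array copy that `data.Concat(...).ToArray()` performs on every frame.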

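(Flights have been so scarce over the last two months that my pay has dropped sharply, so forgive me for speaking from the heart in the sample text.)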

That gives us the audio. I listened to it, and it is acceptable. But how do we get this into a web page?

Turning it into a web application

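The basic idea is that the front end sends the text, the back end hands it to the SDK to get the audio, and once the file is saved the back end returns its address to the front end. Because sending the request and receiving the result happen at different times, the best fit is for the front end to use WebSocket as well. That in turn means the back end needs a WebSocket service.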

 

I did not want to run a separate WebSocket service, so I exposed the WebSocket endpoint as a Web API action instead, as follows:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Web;
using System.Web.Hosting;
using System.Web.Http;
using System.Web.WebSockets;
using NAudio.Wave;
// Also import the namespaces of XunFeiTTS / TTS_Data_Model and of the project's Logger.

namespace HHOA.MVC5.Controllers.API
{
    [RoutePrefix("api/msg")]
    public class MsgApiController : ApiController
    {
        // List<T> is not thread-safe; guard it with a lock if many clients connect at once
        private static readonly List<WebSocket> _sockets = new List<WebSocket>();
        private readonly XunFeiTTS _xunFei;
        private WebSocket currentSocket = null;
        private byte[] data = new byte[0]; // audio bytes received so far for the current text

        public MsgApiController()
        {
            _xunFei = new XunFeiTTS();
            _xunFei.MessageUpdate_Event += XunFeiNetSdk_MessageUpdate_Event;
            Logger.Info("XunFeiTTS started");
        }

        private void XunFeiNetSdk_MessageUpdate_Event(TTS_Data_Model message, string error = null)
        {
            if (error != null)
            {
                Logger.Info(error);
                return;
            }

            try
            {
                // Every frame, including the last one, carries audio
                data = data.Concat(message.audioStream).ToArray();

                // status == 2 marks the end of synthesis
                if (message.status != 2) return;

                Logger.Info("Synthesis succeeded");
                var savePath = HostingEnvironment.MapPath("~/Files/Voice/");
                string voice = string.Format("{0}.wav", DateTime.Now.ToString("yyyyMMddHHmmssfff"));
                var filePath = Path.Combine(savePath, voice);
                var webPath = "/Files/Voice/" + voice;

                Directory.CreateDirectory(savePath); // no-op if the directory already exists

                Logger.Info("Saving..." + filePath);
                using (var writer = new WaveFileWriter(filePath, new WaveFormat(16000, 1)))
                {
                    writer.Write(data, 0, data.Length);
                }
                data = new byte[0]; // start fresh for the next text on this socket
                Logger.Info("Saved");

                // Send the audio address to the front end
                if (currentSocket != null && currentSocket.State == WebSocketState.Open)
                {
                    var bytes = Encoding.UTF8.GetBytes("voice:" + webPath);
                    // Not awaited: this event handler is synchronous
                    currentSocket.SendAsync(new ArraySegment<byte>(bytes), WebSocketMessageType.Text, true, CancellationToken.None);
                }
            }
            catch (Exception ex)
            {
                data = new byte[0]; // drop partial audio so it cannot leak into the next file
                Logger.Debug(ex.Message);
            }
        }

        [Route]
        [HttpGet]
        public HttpResponseMessage Connect()
        {
            if (!HttpContext.Current.IsWebSocketRequest)
            {
                return Request.CreateResponse(HttpStatusCode.BadRequest);
            }

            // Accept the WebSocket request on the server side. ProcessRequest is called once the
            // WebSocket is established and handles sending and receiving messages.
            HttpContext.Current.AcceptWebSocketRequest(ProcessRequest);

            // Build the response that agrees to switch to the WebSocket protocol
            return Request.CreateResponse(HttpStatusCode.SwitchingProtocols);
        }

        public async Task ProcessRequest(AspNetWebSocketContext context)
        {
            var socket = context.WebSocket; // the context carries the current WebSocket object
            _sockets.Add(socket);           // keep track of it in the static list

            // A message longer than this buffer arrives across several ReceiveAsync calls;
            // this demo treats each received frame as a complete message.
            var buffer = new ArraySegment<byte>(new byte[1024]);

            // Loop until the WebSocket closes
            while (true)
            {
                var receivedResult = await socket.ReceiveAsync(buffer, CancellationToken.None);
                if (receivedResult.MessageType == WebSocketMessageType.Close)
                {
                    // The client asked to close: acknowledge it
                    await socket.CloseAsync(WebSocketCloseStatus.Empty, string.Empty, CancellationToken.None);
                    _sockets.Remove(socket);
                    break;
                }

                if (socket.State == WebSocketState.Open)
                {
                    string recvMsg = Encoding.UTF8.GetString(buffer.Array, 0, receivedResult.Count);
                    Logger.Info("Received message: " + recvMsg);

                    // Set the reply target before synthesis starts, then forward the text to iFlytek
                    currentSocket = socket;
                    _xunFei.SendData(recvMsg);

                    // Echo only the bytes actually received, not the whole 1024-byte buffer
                    var sendBuffer = new ArraySegment<byte>(buffer.Array, 0, receivedResult.Count);
                    await socket.SendAsync(sendBuffer, WebSocketMessageType.Text, true, CancellationToken.None);
                }
            }
        }
    }
}

The page script (uses jQuery):

var webSocket;
var player = document.getElementById("player");

function sendSocketMsg() {
    var msg = $("#msg").val();
    webSocket.send(msg);
    showMsg("Sent: " + msg, "blue");
}

openSocket();

function openSocket() {
    closeSocket();
    // Use wss:// when the page itself is served over https
    var scheme = location.protocol === "https:" ? "wss://" : "ws://";
    webSocket = new WebSocket(scheme + location.host + "/api/msg");
    webSocket.onopen = function () {
        showMsg("Connection opened");
    };
    webSocket.onerror = function () {
        showMsg("An error occurred");
    };
    webSocket.onmessage = function (event) {
        showMsg("Received: " + event.data, "yellow");
        // The server announces the audio as "voice:<path>". Match the prefix only,
        // because the echoed text may contain "voice:" somewhere else.
        if (event.data.indexOf("voice:") === 0) {
            player.src = event.data.substring("voice:".length);
            player.play();
        }
    };
    webSocket.onclose = function () {
        showMsg("Connection closed");
    };
}

function closeSocket() {
    if (webSocket != null && typeof (webSocket) != "undefined") {
        webSocket.close();
    }
}

function showMsg(msg, type) {
    if (type === null || typeof (type) === "undefined") type = "gray";
    // text() escapes the message so user input is not interpreted as HTML
    $("#show").append($("<span>").addClass(type).text(msg), "<br>");
}

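That gives us a prototype of the product. Next come things like handling the length of the text, the look of the audio player, and switching between voices. Whenever someone describes a feature in one sentence, there are a great many details hiding behind it.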
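As one possible way to handle long text, the sketch below splits the input at sentence-ending punctuation and packs the sentences into chunks of at most `maxLen` characters, so each chunk can be sent for synthesis separately. The function name `splitText` and the choice of `maxLen` are assumptions for illustration. The actual size limit per request is defined in the iFlytek documentation and should be checked there.

```javascript
// Split text into chunks of at most maxLen characters, preferring sentence boundaries.
function splitText(text, maxLen) {
    if (!(maxLen > 0)) throw new RangeError("maxLen must be positive");

    // A sentence is any run of text up to and including its closing punctuation;
    // the second alternative catches trailing text without closing punctuation.
    var sentences = text.match(/[^。!?;!?;\n]*[。!?;!?;\n]+|[^。!?;!?;\n]+$/g) || [];
    var chunks = [];
    var current = "";

    sentences.forEach(function (s) {
        // A single sentence longer than maxLen has to be cut at a fixed length
        while (s.length > maxLen) {
            if (current) { chunks.push(current); current = ""; }
            chunks.push(s.slice(0, maxLen));
            s = s.slice(maxLen);
        }
        // Start a new chunk when this sentence no longer fits into the current one
        if (current.length + s.length > maxLen) {
            chunks.push(current);
            current = "";
        }
        current += s;
    });

    if (current) chunks.push(current);
    return chunks;
}
```

Joining the returned chunks reproduces the original text, so nothing is lost. The page would then send the chunks one at a time and play the returned audio files in order.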

Console version source code: https://download.csdn.net/download/stoneniqiu/12347028

Web version source code: https://download.csdn.net/download/stoneniqiu/12347167

If you have no download points, you can follow my subscription account and reply 語音合