Speech Enhancement with Deep Learning

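Author: YeBobr
Link: https://www.zhihu.com/question/273665262/answer/388296862
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for permission; for non-commercial reprints, please credit the source.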

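The current frontier of deep learning for speech enhancement is probably the GAN: the generator serves as the enhancement network, and the discriminator tries to tell clean speech apart from enhanced speech. The two main papers are: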

1.SEGAN: Speech Enhancement Generative Adversarial Network

2.Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification
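
To make the generator and discriminator roles concrete, here is a minimal sketch of the two training objectives. It assumes a least-squares GAN loss plus an L1 term that pulls the enhanced waveform toward the clean one; that formulation, the function names, and the `l1_weight` value are illustrative assumptions, not details given in the answer. `d_clean` and `d_enhanced` stand for discriminator scores on clean and enhanced speech.

```python
import numpy as np

def discriminator_loss(d_clean, d_enhanced):
    """Least-squares loss (assumed): push scores for clean speech toward 1
    and scores for enhanced speech toward 0."""
    return 0.5 * np.mean((d_clean - 1.0) ** 2) + 0.5 * np.mean(d_enhanced ** 2)

def generator_loss(d_enhanced, enhanced, clean, l1_weight=100.0):
    """The generator (enhancement network) wants the discriminator to score
    its output as 1; the L1 term keeps the output close to the clean target.
    l1_weight is an illustrative value."""
    adversarial = 0.5 * np.mean((d_enhanced - 1.0) ** 2)
    l1 = np.mean(np.abs(enhanced - clean))
    return adversarial + l1_weight * l1
```

In a real training loop the two losses would be minimized alternately, with the scores produced by an actual discriminator network rather than passed in as arrays.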

 

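Among convolutional approaches, some are fully convolutional and some use redundant convolutional architectures, and they process speech either in the time domain or in the frequency domain. The papers are: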

1.Single channel speech enhancement using convolutional neural network

2.A FULLY CONVOLUTIONAL NEURAL NETWORK FOR SPEECH ENHANCEMENT

3.Raw Waveform-based Speech Enhancement by Fully Convolutional Networks
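
As a rough illustration of the time-domain, fully convolutional idea, the sketch below stacks 1-D convolutions with "same" padding so that a raw waveform of any length maps to an output of the same length. It is only a forward pass in NumPy; the function names, the ReLU activations, and the layer layout are assumptions made for illustration, not the architecture of any of the papers listed above.

```python
import numpy as np

def conv1d_same(x, kernels, bias):
    """x: (in_ch, T); kernels: (out_ch, in_ch, K) with odd K. Returns (out_ch, T)."""
    out_ch, in_ch, k = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))  # zero-pad so the length is preserved
    out = np.zeros((out_ch, x.shape[1]))
    for o in range(out_ch):
        for c in range(in_ch):
            # "valid" correlation slides the kernel over the padded input
            out[o] += np.correlate(xp[c], kernels[o, c], mode="valid")
        out[o] += bias[o]
    return out

def fcn_enhance(waveform, layers):
    """Apply a stack of (kernels, bias) layers to a 1-D waveform.
    No dense layers are used, so the input length is not fixed."""
    h = waveform[np.newaxis, :]
    for i, (kernels, bias) in enumerate(layers):
        h = conv1d_same(h, kernels, bias)
        if i < len(layers) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h[0]
```

The last layer is expected to have a single output channel so that the result is again a 1-D waveform.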

 

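On the DNN side, speech is mostly processed in the frequency domain: the short-time spectrum is obtained with the short-time Fourier transform, the short-time spectrum is processed, and the enhanced speech is reconstructed using the phase of the noisy speech. There are also some approaches that combine a DNN with a traditional speech enhancement method, replacing the features in the traditional method with a DNN; that is basically the recipe. The papers are: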

1.Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks

2.NMF-based Speech Enhancement Incorporating Deep Neural Network

3.A Novel Single Channel Speech Enhancement Based on Joint Deep Neural Network and Wiener Filter

4.An Experimental Study on Speech Enhancement Based on Deep Neural Networks

5.A Regression Approach to Speech Enhancement Based on Deep Neural Networks
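
The frequency-domain pipeline described above (short-time Fourier transform, magnitude processing, reconstruction with the noisy phase) can be sketched in NumPy as follows. `magnitude_model` is a hypothetical placeholder for a trained DNN, and the Hann window, frame size, and hop size are assumed values chosen only for illustration.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Short-time Fourier transform with a Hann window.
    Assumes len(x) >= n_fft. Returns an array of shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by weighted overlap-add, normalized by the summed squared window."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(frames) - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + n_fft] += frame
        norm[i * hop : i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def enhance(noisy, magnitude_model, n_fft=512, hop=128):
    """Enhance the magnitude spectrum and reuse the phase of the noisy speech."""
    spec = stft(noisy, n_fft, hop)
    magnitude, phase = np.abs(spec), np.angle(spec)
    enhanced_mag = magnitude_model(magnitude)          # placeholder for the DNN
    enhanced_spec = enhanced_mag * np.exp(1j * phase)  # noisy phase, enhanced magnitude
    return istft(enhanced_spec, n_fft, hop)
```

With an identity function passed as `magnitude_model`, `enhance` returns the input essentially unchanged away from the signal edges, which is a quick sanity check of the analysis and synthesis steps before plugging in a real network.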