Research on Speech Feature Extraction Methods Based on Deep Learning Models
LIANG Jing, LIU Gang**
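(School of Information and Communication Engineering, Beijing University of Posts and
Telecommunications, Beijing 100876)
Abstract: With the development of the mobile Internet, speech recognition, a key technology for
free human-machine interaction, is receiving increasing attention. In particular, now that the
era of big data has arrived and massive amounts of speech data can be collected, how to make
effective use of this unlabeled raw data has become a research focus in speech recognition. At
the same time, deep learning models, with their strong capacity for modeling massive data, can
process such unlabeled data directly and are becoming ever more closely tied to speech
recognition. Building on the combination of speech recognition and deep learning theory, this
paper studies how to use deep learning models to extract more robust acoustic features. Two
models, an autoencoder and a deep neural network, are used with unsupervised and supervised
training methods to automatically extract new features from the original speech features.
Compared with the original […] features, the new features extracted with these two models
improve word recognition accuracy by […]% and […]%, respectively.
Keywords: speech recognition; deep neural network; deep autoencoder; feature extraction
CLC number: TP181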
Speech Feature Extraction Based on Deep Learning Models
LIANG Jing, LIU Gang
(School of Information and Communication Engineering, Beijing University of Posts and
Telecommunications, Beijing 100876)
Abstract: With the rapid development of the mobile Internet, speech recognition, which remains key
to human-machine interaction, is attracting more and more attention. Especially in the era of big
data, when access to large amounts of speech data is possible, finding ways to use this
unlabeled data effectively has become a hot topic in the field of speech recognition. Meanwhile,
deep learning models are increasingly used in combination with speech recognition
technologies because of their outstanding performance in processing unlabeled speech corpora and
in data modeling. Building on the combination of speech recognition and deep learning
theory, this thesis aims to extract more robust acoustic features using deep models. In the
ex