
Short Utterance Based Speech Language Identification in Intelligent Vehicles With Time-Scale Modifications and Deep Bottleneck Features



Abstract

Conversations in intelligent vehicles usually consist of short utterances. Because these utterances are brief (e.g., less than 3 s), it is difficult to learn enough information to distinguish between languages. In this paper, we propose an end-to-end speech language identification (SLI) approach that is especially suited to short utterances. The approach is implemented with a long short-term memory (LSTM) neural network designed for the SLI application in intelligent vehicles. The features used for LSTM learning are generated by a transfer learning method: the bottleneck features of a deep neural network, trained as a Mandarin acoustic-phonetic classifier, are used for LSTM training. To improve SLI accuracy on short utterances, a phase vocoder based time-scale modification method is used to reduce or increase the speech rate of the test utterance. By concatenating the normal, rate-reduced, and rate-increased utterances, we extend the length of the test utterance so that the performance of the SLI system improves. Experimental results on the AP17-OLR database demonstrate that the proposed method improves SLI performance, especially on short utterances. The proposed SLI system also performs robustly in noisy vehicular environments.
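The test-time extension step described in the abstract (concatenating the normal, slowed, and sped-up versions of one utterance) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `naive_time_scale` and `extend_utterance` and the rates 0.8 and 1.2 are assumptions made for illustration; the abstract does not state the rates used. The sketch also swaps in plain linear-interpolation resampling for the phase vocoder, which changes pitch as well as duration, so it shows only the length arithmetic of the concatenation.

```python
import math
from typing import List, Sequence


def naive_time_scale(samples: Sequence[float], rate: float) -> List[float]:
    """Change duration by a factor of 1/rate using linear interpolation.

    NOTE: a stand-in for the phase vocoder named in the abstract. Unlike a
    phase vocoder, plain resampling also shifts the pitch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    out_len = max(1, round(len(samples) / rate))
    out = []
    for i in range(out_len):
        pos = min(i * rate, last)   # fractional read position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def extend_utterance(samples: Sequence[float],
                     slow_rate: float = 0.8,   # assumed value
                     fast_rate: float = 1.2    # assumed value
                     ) -> List[float]:
    """Concatenate the normal, slowed, and sped-up versions of an utterance."""
    slow = naive_time_scale(samples, slow_rate)
    fast = naive_time_scale(samples, fast_rate)
    return list(samples) + slow + fast


# Usage: a 1 s, 440 Hz tone sampled at 16 kHz (hypothetical test signal).
sample_rate = 16000
utterance = [math.sin(2 * math.pi * 440 * n / sample_rate)
             for n in range(sample_rate)]
extended = extend_utterance(utterance)
print(len(utterance) / sample_rate, len(extended) / sample_rate)
```

With these assumed rates, a 1 s utterance becomes roughly three times as long, which is the sense in which the concatenation gives the classifier more frames to work with.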

Bibliographic details

  • Source
    IEEE Transactions on Vehicular Technology | 2019, No. 1 | pp. 121-128 | 8 pages
  • Author affiliations

    Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China;

    Ludong Univ, Sch Informat & Elect Engn, Yantai 264000, Peoples R China;

    Beijing Sogou Technol Dev Co Ltd, Beijing 100000, Peoples R China;

    Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China;

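  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords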

    Speech language identification; time-scale modification; DNN-BN feature; LSTM;

  • Date added to database: 2022-08-18 04:12:11
