Short Utterance Based Speech Language Identification in Intelligent Vehicles With Time-Scale Modifications and Deep Bottleneck Features

Ma Zhanyu; Yu Hong; Chen Wei; Guo Jun

Abstract

Conversations in intelligent vehicles usually consist of short utterances. Because these utterances are brief (e.g., less than 3 s), it is difficult to extract enough information to distinguish between languages. In this paper, we propose an end-to-end speech language identification (SLI) approach that is especially suited to short utterances. The approach is implemented with a long short-term memory (LSTM) neural network designed for the SLI application in intelligent vehicles. The features used for LSTM training are generated by a transfer learning method: they are the bottleneck features of a deep neural network trained as a Mandarin acoustic-phonetic classifier. To improve SLI accuracy on short utterances, a phase vocoder based time-scale modification method is used to reduce or increase the speech rate of the test utterance. By concatenating the normal, rate-reduced, and rate-increased utterances, we extend the length of the test utterance so that the performance of the SLI system improves. Experimental results on the AP17-OLR database demonstrate that the proposed method improves SLI performance, especially on short utterances. The proposed SLI approach also performs robustly in noisy vehicular environments.
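The test-time extension described in the abstract (time-scale modification with a phase vocoder, then concatenation of the normal, slowed, and sped-up versions) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the frame size and hop, the stretch rates 0.8 and 1.2, and the 16 kHz sampling rate in the demo are all assumptions chosen for the example.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by windowed overlap-add."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # Floor the normaliser so the weakly covered edges fade out instead of blowing up.
    return out / np.maximum(norm, 0.1 * norm.max())


def time_stretch(x, rate, n_fft=512, hop=128):
    """Phase-vocoder time-scale modification.

    rate < 1 slows the speech down (longer output); rate > 1 speeds it up.
    """
    spec = stft(x, n_fft, hop)
    n_frames, n_bins = spec.shape
    # Phase advance expected over one hop for each bin's centre frequency.
    omega = 2 * np.pi * hop * np.arange(n_bins) / n_fft
    steps = np.arange(0, n_frames - 1, rate)  # fractional positions in the input
    out = np.zeros((len(steps), n_bins), dtype=complex)
    phase = np.angle(spec[0])
    for k, t in enumerate(steps):
        i = int(np.floor(t))
        frac = t - i
        # Interpolate the magnitude between the two neighbouring input frames.
        mag = (1 - frac) * np.abs(spec[i]) + frac * np.abs(spec[i + 1])
        out[k] = mag * np.exp(1j * phase)
        # Deviation of the measured phase increment from the expected one,
        # wrapped to [-pi, pi], then accumulated for the next output frame.
        dphi = np.angle(spec[i + 1]) - np.angle(spec[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase = phase + omega + dphi
    return istft(out, n_fft, hop)


def extend_utterance(x, slow=0.8, fast=1.2):
    """Concatenate the normal, rate-reduced, and rate-increased versions of x."""
    return np.concatenate([x, time_stretch(x, slow), time_stretch(x, fast)])


if __name__ == "__main__":
    sr = 16000  # assumed sampling rate
    t = np.arange(sr) / sr
    utterance = np.sin(2 * np.pi * 220 * t)  # 1 s synthetic stand-in for speech
    extended = extend_utterance(utterance)
    print(len(utterance) / sr, "s ->", len(extended) / sr, "s")
```

With these assumed rates, a 1 s input grows to roughly three times its original length, which gives a sequence model more frames to accumulate evidence from without requiring any additional speech from the speaker.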

Bibliographic details

Source
IEEE Transactions on Vehicular Technology | 2019, Issue 1 | pp. 121-128 | 8 pages
Authors
Ma Zhanyu; Yu Hong; Chen Wei; Guo Jun
Author affiliations

Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China;

Ludong Univ, Sch Informat & Elect Engn, Yantai 264000, Peoples R China;

Beijing Sogou Technol Dev Co Ltd, Beijing 100000, Peoples R China;

Beijing Univ Posts & Telecommun, Pattern Recognit & Intelligent Syst Lab, Beijing 100876, Peoples R China;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language of full text: English
Keywords
Speech language identification; time-scale modification; DNN-BN feature; LSTM;

Date added to database: 2022-08-18 04:12:11

Similar literature

1. Bottleneck Feature-Based Hybrid Deep Autoencoder Approach for Indian Language Identification [J]. Himanish Shekhar Das, Pinki Roy. Arabian Journal for Science and Engineering. Section A, Sciences. 2020, Issue 4.
2. Frame-by-frame language identification in short utterances using deep neural networks [J]. Gonzalez-Dominguez Javier, Lopez-Moreno Ignacio, Moreno Pedro J. Neural Networks: The Official Journal of the International Neural Network Society. 2015.
3. Residual convolutional neural network with attentive feature pooling for end-to-end language identification from short-duration speech [J]. Monteiro Joao, Alam Jahangir, Falk Tiago H. Computer Speech and Language. 2019, Nov. issue.
4. Deep Neural Networks for i-Vector Language Identification of Short Utterances in Cars [C]. Omid Ghahabi, Antonio Bonafonte, Javier Hernando. Annual Conference of the International Speech Communication Association. 2016.
5. Applying Machine and Statistical Learning Techniques to Intelligent Transport Systems: Bottleneck Identification and Prediction, Dynamic Travel Time Prediction, Driver Stoprun Behavior Modeling, and Autonomous Vehicle Control at Intersections [D]. Elhenawy, Mohammed Mamdouh Zakaria. 2015.
6. Deep Bottleneck Features for Spoken Language Identification [O]. Bing Jiang, Yan Song, Si Wei.
7. Time-scale modification of speech based on short-time Fourier analysis [O]. Portnoff, Michael Rodney. 1978.
