Noise Robust Keyword Spotting Using Deep Neural Networks For Embedded Platforms

Abstract

The recent development of embedded platforms, along with spectacular growth in communication networking technologies, is driving the Internet of Things to thrive. More complex tasks that are in great demand, such as speech recognition and keyword spotting, can now run on small devices. Traditional voice recognition approaches are already used in several embedded applications; some are hybrid (cloud-based and embedded), while others are fully embedded. However, the environment surrounding embedded devices is usually noisy. Conventional approaches to adding noise robustness to speech recognition are effective but costly in terms of memory consumption and hardware complexity, which limits their use on embedded platforms. The purpose of this thesis is to increase the robustness of keyword spotting to more than one type of noise at once, without increasing the memory footprint or requiring a denoiser, while maintaining recognition accuracy at an acceptable level. In this work, robustness is treated at the phoneme classification level, as phoneme-based keyword spotting is the best technique for embedded keyword spotting. Deep neural networks have been successfully deployed in many applications, including noise-robust speech recognition. In this work, we use multi-condition utterance training of a deep neural network model to increase the noise robustness of keyword spotting. The same technique is also used to train a Gaussian mixture model. The two approaches are compared, and deep learning not only outperformed the Gaussian approach but also outperformed the use of a denoiser system. This results in a smaller, more accurate, and more noise-robust model for phoneme recognition.
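The abstract does not say how the multi-condition training utterances were built, so the following is only a minimal sketch of one common way to do it: mixing each clean utterance with several noise recordings at chosen signal-to-noise ratios (SNRs). The function names, the noise labels, and the SNR values are all hypothetical and are not taken from the thesis.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean utterance at the requested SNR (in dB).

    Assumes both inputs are non-empty and have non-zero power.
    """
    # Repeat (or truncate) the noise so it covers the whole utterance.
    noise = [noise[i % len(noise)] for i in range(len(clean))]
    # SNR_dB = 10 * log10(P_clean / P_noise), solved for the noise power.
    target_noise_power = power(clean) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [c + gain * n for c, n in zip(clean, noise)]


def multi_condition_set(utterances, noises, snrs_db, seed=0):
    """Build a training set containing each clean utterance plus one
    noisy copy per noise type, each at a randomly chosen SNR.

    `noises` maps a (hypothetical) noise label to its samples.
    Returns a list of (samples, condition_label) pairs.
    """
    rng = random.Random(seed)
    dataset = []
    for utt in utterances:
        dataset.append((utt, "clean"))
        for name, noise in noises.items():
            snr = rng.choice(snrs_db)
            dataset.append((mix_at_snr(utt, noise, snr), f"{name}@{snr}dB"))
    return dataset
```

Training one acoustic model on such a pooled set exposes it to several noise types at once, so no separate denoising stage is needed at inference time. That matches the goal stated in the abstract, though the actual data preparation in the thesis may differ.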

Bibliographic details

  • Author: Abdelmoula Ramzi
  • Year: 2016
  • Original format: PDF
  • Language: English
