Online phoneme recognition using multi-layer perceptron networks combined with recurrent non-linear autoregressive neural networks with exogenous inputs

Abstract

Offline pattern recognition in speech signals is a complex task, and it becomes harder still when the recognition result is required online or in real time. The present work proposes online identification of Portuguese phonemes using a non-linear autoregressive model with exogenous inputs, commonly called NARX. The process first conditions the input speech signal and extracts its frequency characteristics. It then pre-classifies the extracted features into one of the ten possible phoneme groups of the Portuguese language. This pre-classification is performed by a multilayer perceptron (MLP) network trained with supervised learning. Subsequently, the MLP output vector, together with the vector that carries the input frequencies, feeds a NARX neural network through a temporal delay of four time steps and feedback recurrent links that carry the results of all hidden layers of the network. As a result, the proposed process improves the accuracy of online identification of spoken Portuguese phonemes during natural conversation. When the phoneme input signal is well conditioned and continuous over time, the proposed recognition process can provide the correct classification in real time, with an acceptable accuracy rate. (C) 2017 Elsevier B.V. All rights reserved.
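
The two-stage data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the weights are random and untrained, and the feature size, hidden-layer width, number of output phoneme classes, the single hidden layer per network, and all class and function names (`MLP`, `NarxClassifier`, `recognize_stream`) are assumptions made for the example. Only the ten phoneme groups, the delay of four, and the feedback from the hidden layer come from the abstract, and the delay is read here as a tapped delay line over the four previous time steps.

```python
from collections import deque

import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 16  # assumed size of the frequency feature vector
N_GROUPS = 10    # ten phoneme groups, as stated in the abstract
N_PHONEMES = 30  # assumed number of output phoneme classes
N_HIDDEN = 24    # assumed hidden-layer width
N_DELAYS = 4     # delay of four time steps, as stated in the abstract


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


class MLP:
    """Stage 1: scores the ten phoneme groups for one feature frame."""

    def __init__(self, n_in, n_hidden, n_out):
        self.w1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = np.tanh(self.w1 @ x + self.b1)
        return softmax(self.w2 @ h + self.b2)


class NarxClassifier:
    """Stage 2: NARX-style network with delayed inputs and hidden feedback."""

    def __init__(self, n_exo, n_hidden, n_out, n_delays):
        self.n_exo, self.n_hidden, self.n_delays = n_exo, n_hidden, n_delays
        # Current exogenous input + delayed inputs + delayed hidden states.
        n_in = (n_delays + 1) * n_exo + n_delays * n_hidden
        self.w1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.b2 = np.zeros(n_out)
        self.reset()

    def reset(self):
        """Clear both delay lines (zero history)."""
        self.exo_hist = deque(
            [np.zeros(self.n_exo) for _ in range(self.n_delays)],
            maxlen=self.n_delays,
        )
        self.hid_hist = deque(
            [np.zeros(self.n_hidden) for _ in range(self.n_delays)],
            maxlen=self.n_delays,
        )

    def step(self, u):
        """Process one time step and return phoneme class probabilities."""
        z = np.concatenate([u, *self.exo_hist, *self.hid_hist])
        h = np.tanh(self.w1 @ z + self.b1)
        y = softmax(self.w2 @ h + self.b2)
        # Newest entries go to the front; the oldest fall off the back.
        self.exo_hist.appendleft(u)
        self.hid_hist.appendleft(h)
        return y


def recognize_stream(frames, mlp, narx):
    """Label a sequence of feature frames one frame at a time."""
    narx.reset()
    labels = []
    for x in frames:
        group_scores = mlp.forward(x)             # pre-classification
        u = np.concatenate([x, group_scores])     # features + MLP output
        labels.append(int(np.argmax(narx.step(u))))
    return labels


mlp = MLP(N_FEATURES, N_HIDDEN, N_GROUPS)
narx = NarxClassifier(N_FEATURES + N_GROUPS, N_HIDDEN, N_PHONEMES, N_DELAYS)
frames = rng.normal(size=(8, N_FEATURES))  # stand-in for real speech features
print(recognize_stream(frames, mlp, narx))
```

Because each call to `step` depends only on the current frame and the stored history, the loop can run frame by frame as audio arrives, which is the property an online recognizer needs. With untrained weights the printed labels are meaningless; the sketch only shows how the vectors are wired together.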

Bibliographic details

  • Source
    Neurocomputing | 2017, Issue 22 | pp. 78-90 | 13 pages
  • Author affiliations

    Univ Estado Rio De Janeiro, Engn Fac, Dept Elect Engn & Telecommun, Rio De Janeiro, Brazil|Univ Estado Rio De Janeiro, Engn Fac, Dept Syst Engn & Computat, Rio De Janeiro, Brazil;

    Univ Estado Rio De Janeiro, Engn Fac, Dept Elect Engn & Telecommun, Rio De Janeiro, Brazil|Univ Estado Rio De Janeiro, Engn Fac, Dept Syst Engn & Computat, Rio De Janeiro, Brazil;

    Univ Estado Rio De Janeiro, Engn Fac, Dept Elect Engn & Telecommun, Rio De Janeiro, Brazil|Univ Estado Rio De Janeiro, Engn Fac, Dept Syst Engn & Computat, Rio De Janeiro, Brazil;

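  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords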

    Pattern recognition; Phoneme identification; Multi-layer perceptron; NARX neural networks; Dynamic neural networks;