Journal: International Journal of Applied Mathematics and Computer Science

TOWARDS SPIKE-BASED SPEECH PROCESSING: A BIOLOGICALLY PLAUSIBLE APPROACH TO SIMPLE ACOUSTIC CLASSIFICATION

Abstract

The shortcomings of automatic speech recognition (ASR) applications are becoming more evident as these applications are used more widely in real life. The inherent non-stationarity associated with the timing of speech signals, together with dynamic changes in the environment, makes the ensuing analysis and recognition extremely difficult. Researchers often turn to biology for clues to build better engineered systems, and ASR is no exception: feature sets such as Mel frequency cepstral coefficients employ filter banks that resemble cochlear filter banks in frequency distribution and bandwidth. In this paper, we delve deeper into the mechanics of the human auditory system to take this biological inspiration to the next level. The main goal of this research is to investigate the computational potential of spike trains produced at the early stages of the auditory system for a simple acoustic classification task. First, spike coding schemes ranging from temporal to rate coding are explored, together with spike-based encoders of varying complexity, such as rank order coding and the liquid state machine. Based on these findings, a biologically plausible system architecture is proposed for the recognition of phonetically simple acoustic signals; it makes exclusive use of spikes for computation. Performance tests show that it outperforms a conventional ASR system on a noisy vowel data set.
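To make the coding schemes named in the abstract more concrete, the following is a minimal sketch of how one frame of filter-bank channel energies might be turned into spikes under three textbook schemes: latency (temporal) coding, rank order coding, and rate coding. It is an illustration under stated assumptions, not the encoder used in the paper: the function names, the `t_max` and `max_spikes` parameters, and the example energies are all hypothetical.

```python
def latency_code(energies, t_max=10.0):
    """Temporal (time-to-first-spike) coding: stronger channels fire earlier.

    Assumption: latency falls linearly from t_max (silent channel)
    to 0 (strongest channel). Units of t_max are arbitrary.
    """
    peak = max(energies)
    if peak <= 0:
        return [t_max] * len(energies)
    return [t_max * (1.0 - e / peak) for e in energies]


def rank_order_code(energies):
    """Rank order coding: keep only the order in which channels fire.

    Returns channel indices sorted from strongest (fires first)
    to weakest (fires last); exact spike times are discarded.
    """
    return sorted(range(len(energies)), key=lambda i: -energies[i])


def rate_code(energies, max_spikes=10):
    """Rate coding: spike count in a window scales with channel energy.

    Assumption: the strongest channel emits max_spikes spikes and the
    others emit a proportional, rounded count.
    """
    peak = max(energies)
    if peak <= 0:
        return [0] * len(energies)
    return [round(e / peak * max_spikes) for e in energies]


# Hypothetical energies for four filter-bank channels in one frame.
frame = [0.2, 0.9, 0.5, 0.1]
print(rank_order_code(frame))  # [1, 2, 0, 3]
print(rate_code(frame))        # [2, 10, 6, 1]
print(latency_code(frame))     # channel 1 fires at t = 0.0
```

The three functions differ in how much information they retain: latency coding keeps relative spike times, rank order coding keeps only the firing order, and rate coding keeps only spike counts within a window.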