Using Values of the Human Cochlea in the Macro and Micro Mechanical Model for Automatic Speech Recognition

Abstract

Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR), because this hearing organ in mammals is the most important element in transducing the sound pressure received by the outer ear. This paper shows how the macro and micro mechanical model of the cochlea can be used in ASR tasks. The cochlear parameter values that Neely, Elliot and Ku reported in their works on the macro and micro mechanical model (the Neely model) were used to set the center frequencies of a filter bank, so that parameters are obtained from speech in a way similar to how MFCC (Mel Frequency Cepstral Coefficients) are constructed. We propose a new way of distributing the filters of the filter bank in our parametric representation, which gives a frequency-domain representation of speech that differs from MFCC. The response of the macro and micro mechanical model under each of these three sets of values was used to create the center frequencies of the filter bank; that is, the Mel scale function was replaced by a representation based on the cochlear response of the Neely model. The model was run with the different sets of cochlear parameters used by Neely, Elliot and Ku in their works, such as mass, damping and stiffness, among others. A recognition performance of 98 to 100% was reached on a task using Spanish isolated digits pronounced by 5 different speakers. The approach was also applied to the neutral recordings of the SUSAS corpus, with some advantages in comparison with MFCC.
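The idea of replacing the Mel scale with a cochlea-based map when placing filter-bank center frequencies can be illustrated with a minimal sketch. The abstract does not give the Neely-model response or the Neely, Elliot and Ku parameter values, so the sketch below swaps in the Greenwood place-frequency function for the human cochlea as a stand-in; it is not the authors' method. The function names, the choice of 20 filters, and the 100 to 4000 Hz range are assumptions made only for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Common Mel-scale formula used in MFCC front ends."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def greenwood_hz(x):
    """Greenwood place-frequency map for the human cochlea.

    x is the fractional distance along the basilar membrane,
    0.0 at the apex and 1.0 at the base. This is a stand-in
    (assumption), not the Neely model used in the paper.
    """
    return 165.4 * (10.0 ** (2.1 * x) - 0.88)


def greenwood_position(f_hz):
    """Inverse of greenwood_hz: frequency in Hz to fractional place."""
    return math.log10(f_hz / 165.4 + 0.88) / 2.1


def center_frequencies(n_filters, f_min, f_max, to_scale, from_scale):
    """Place n_filters centers equally spaced on a warped scale.

    to_scale maps Hz to the warped axis, from_scale maps back to Hz.
    The two band edges f_min and f_max are excluded from the result.
    """
    lo, hi = to_scale(f_min), to_scale(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [from_scale(lo + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 20 filters between 100 Hz and 4000 Hz.
    mel_centers = center_frequencies(20, 100.0, 4000.0, hz_to_mel, mel_to_hz)
    place_centers = center_frequencies(
        20, 100.0, 4000.0, greenwood_position, greenwood_hz
    )
    for m, p in zip(mel_centers, place_centers):
        print(f"mel: {m:8.1f} Hz   place-based: {p:8.1f} Hz")
```

Only the warping functions differ between the two calls, which mirrors the substitution described in the abstract: the rest of an MFCC-style pipeline (filter shapes, log energies, cepstral transform) could stay unchanged while the center frequencies come from a cochlear map instead of the Mel scale.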


Bibliographic details

Source: Mexican International Conference on Artificial Intelligence, 2014, pp. 242-251 (10 pages)
Authors: Jose Luis Oropeza Rodriguez; Sergio Suarez Guerra
Original format: PDF
Keywords: Speech recognition; cochlea; place theory and bank filter

Similar literature

1. Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L) [J]. Odette Scharenborg, Louis ten Bosch, Lou Boves. The Journal of the Acoustical Society of America, 2003, No. 6.
2. Speech Encoding in the Human Auditory Periphery: Modeling and Quantitative Assessment by Means of Automatic Speech Recognition [J]. Holmberg, Marcus. Fortschritt-Berichte VDI, Reihe 8: Mess-, Steuerungs- und Regelungstechnik, 2009, No. 1162.
3. Critique: The potential role of speech production models in automatic speech recognition [J]. Roger K. Moore. The Journal of the Acoustical Society of America, 1996, No. 3.
4. Using Values of the Human Cochlea in the Macro and Micro Mechanical Model for Automatic Speech Recognition [C]. Jose Luis Oropeza Rodriguez, Sergio Suarez Guerra. Mexican International Conference on Artificial Intelligence, 2014.
5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011.
6. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference [O]. Byeongwook Lee, Kwang-Hyun Cho.
7. Modelling Human Speech Recognition using Automatic Speech Recognition Paradigms in SpeM [O]. Scharenborg O.E., McQueen J.M., Bosch L.F.M. ten. 2003.
