
Representation of speech in the primary auditory cortex and its implications for robust speech processing.



Abstract

Speech has evolved as a primary form of communication between humans. This most used means of communication has been the subject of intense study for years, but there is still a lot that we do not know about it. It is an oft-repeated fact that even the best speech processing algorithms still lag far behind the average human in performance. It seems inescapable that unless we know more about the way the brain performs this task, our machines cannot go much further. This thesis focuses on the question of speech representation in the brain, from both a physiological and a technological perspective. We explore the representation of speech through the encoding of its smallest elements (phonemic features) in the primary auditory cortex. We report on how populations of neurons with diverse tuning properties respond selectively to phonemes, resulting in an explicit encoding of their parameters. Next, we show that this sparse encoding of the phonemic features is a simple consequence of the linear spectro-temporal properties of the auditory cortical neurons, and that a spectro-temporal receptive field model can predict similar patterns of activation. This is an important step toward the realization of systems that operate on the same principles as the cortex. Using an inverse method of reconstruction, we also explore the extent to which phonemic features are preserved in the cortical representation of noisy speech. The results suggest that the cortical responses are more robust to noise and that the important features of phonemes are preserved in the cortical representation even in noise. Finally, we explain how a model of this cortical representation can be used in speech processing and enhancement applications to improve their robustness and performance.
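The abstract's claim that a linear spectro-temporal receptive field model can predict cortical activation can be illustrated with the standard linear formulation, in which the predicted response is the stimulus spectrogram filtered by the receptive field and summed over frequency: r(t) = Σ_f Σ_τ STRF(f, τ) · S(f, t − τ). The following is only a minimal sketch of that general idea, not the thesis's implementation; the function name `strf_response`, the array shapes, and the toy values are assumptions made for illustration.

```python
import numpy as np

def strf_response(spectrogram, strf):
    """Predict a response with a linear receptive-field model (illustrative only).

    spectrogram: array of shape (n_freq, n_time), the stimulus S(f, t)
    strf:        array of shape (n_freq, n_lag), the filter STRF(f, tau)
    Returns r(t) = sum_f sum_tau STRF(f, tau) * S(f, t - tau), length n_time.
    """
    n_freq, n_time = spectrogram.shape
    if strf.shape[0] != n_freq:
        raise ValueError("spectrogram and strf must have the same number of frequency channels")
    response = np.zeros(n_time)
    for f in range(n_freq):
        # Causal convolution along time for this channel, cut to the stimulus length.
        response += np.convolve(spectrogram[f], strf[f])[:n_time]
    return response

# Toy stimulus (hypothetical values): one impulse in channel 1 at time step 2.
toy_spectrogram = np.zeros((3, 6))
toy_spectrogram[1, 2] = 1.0

# Toy receptive field (hypothetical values): only channel 1 is tuned, with a decaying lag profile.
toy_strf = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.5, 0.25],
    [0.0, 0.0, 0.0],
])

toy_response = strf_response(toy_spectrogram, toy_strf)
```

With an impulse stimulus, the predicted response is simply the tuned channel's lag profile delayed to the impulse time, which makes the filter interpretation of the receptive field easy to check by hand.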

Bibliographic details

  • Author

    Mesgarani, Nima

  • Author affiliation

    University of Maryland, College Park

  • Degree-granting institution: University of Maryland, College Park
  • Subject: Biology, Neuroscience; Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 153 p.
  • Total pages: 153
  • Original format: PDF
  • Language of text: English (eng)
