
Computational models for binaural sound source localization and sound understanding.

Abstract

As one of the primary human senses, the auditory system plays an important role in language acquisition. This thesis proposes computational models for binaural sound source localization and sound source understanding. The models form a basic auditory system for a mobile robot that will automatically learn language through multisensory inputs and interaction with the external environment. The localization model follows a hypothesis-driven approach. Using only binaural inputs, it achieves three-dimensional (3D) localization by combining multiple cues. Two binaural cues, interaural time differences (ITDs) and interaural intensity differences (IIDs), and one monaural cue, the spectral cue, are extracted from the input sounds. A hierarchical framework based on Bayes' rule is applied for decision making. Simulations show the effectiveness of the model. A robust ITD estimation algorithm is introduced and implemented on the robot, and satisfactory results are achieved in real-world environments. A multimodal learning scheme that uses vision is proposed so that the robot can learn 3D binaural localization autonomously, with no human instructor involved.

A generic model is presented for sound source understanding. No labelled training data is required to build it. Sounds are represented as histograms, which preserve their time-varying characteristics, and histogram intersection is used as the similarity measure between sounds. The model is successfully applied to content-based audio information retrieval and automatic audio indexing systems.
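The abstract names histogram intersection as the similarity measure but does not say how the histograms are built or normalized. The following is a minimal sketch of the standard definition only (the sum of bin-wise minima of two normalized histograms). The function names, the normalization step, and the 4-bin example values are illustrative assumptions and are not taken from the thesis.

```python
from typing import List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a non-negative histogram so its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the sum of bin-wise minima of two normalized histograms.

    The result lies in [0, 1]: 1 for identical histograms and 0 for
    histograms whose mass falls in entirely different bins.
    """
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(normalize(a), normalize(b)))


# Hypothetical 4-bin histograms (illustrative values, not thesis data).
sound_a = [4, 2, 1, 1]
sound_b = [1, 1, 2, 4]

sim_same = histogram_intersection(sound_a, sound_a)  # identical sounds
sim_diff = histogram_intersection(sound_a, sound_b)  # partially overlapping
```

In a retrieval setting of the kind the abstract mentions, a query sound's histogram could be compared against every stored histogram and the results ranked by this score. That ranking step is likewise an assumption about usage, not a description of the thesis's system.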

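Bibliographic record

Author: Li, Danfeng
Author affiliation: University of Illinois at Urbana-Champaign
Degree-granting institution: University of Illinois at Urbana-Champaign
Discipline: Engineering, Electronics and Electrical; Computer Science
Degree: Ph.D.
Year: 2003
Pages: 108
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics and telecommunications technology; automation and computer technology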

