Conference: International Conference on Auditory Display

OPTIMIZING THE SPATIAL CONFIGURATION OF A SEVEN-TALKER SPEECH DISPLAY

Abstract

Although there is substantial evidence that performance in multitalker listening tasks can be improved by spatially separating the apparent locations of the competing talkers, very little effort has been made to determine the best locations and presentation levels for the talkers in a multichannel speech display. In this experiment, a call-sign-based color and number identification task was used to evaluate the effectiveness of three different spatial configurations and two different level normalization schemes in a seven-channel binaural speech display. When only two spatially adjacent channels of the seven-channel system were active, overall performance was substantially better with a geometrically spaced spatial configuration (with far-field talkers at -90°, -30°, -10°, 0°, +10°, +30°, and +90° azimuth) or a hybrid near-far configuration (with far-field talkers at -90°, -30°, 0°, +30°, and +90° azimuth and near-field talkers at ±90°) than with a more conventional linearly spaced configuration (with far-field talkers at -90°, -60°, -30°, 0°, +30°, +60°, and +90° azimuth). When all seven channels were active, performance was generally better with a "better-ear" normalization scheme that equalized the levels of the talkers in the more intense ear than with a default normalization scheme that equalized the levels of the talkers at the center of the head. The best overall performance in the seven-talker task occurred when the hybrid near-far spatial configuration was combined with the better-ear normalization scheme. This combination resulted in a 20% increase in the number of correct identifications relative to the baseline condition with linearly spaced talker locations and no level normalization. Although this is a relatively modest improvement, it could be achieved at little or no cost simply by reconfiguring the HRTFs used in a multitalker speech display.
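As a rough illustration of the configurations and the better-ear idea described in the abstract, the following Python sketch encodes the three talker layouts and computes a per-talker gain that brings the level at the more intense ear to a common target. The azimuths come from the abstract; everything else (the names `CONFIGS`, `better_ear_gain_db`, `apply_gain`, the use of dB levels per ear, and the target level) is a hypothetical assumption for illustration, not the authors' implementation.

```python
import math

# Talker positions from the abstract: (azimuth in degrees, "far" or "near" field).
CONFIGS = {
    "linear": [(az, "far") for az in (-90, -60, -30, 0, 30, 60, 90)],
    "geometric": [(az, "far") for az in (-90, -30, -10, 0, 10, 30, 90)],
    "hybrid": [(az, "far") for az in (-90, -30, 0, 30, 90)]
    + [(-90, "near"), (90, "near")],
}


def better_ear_gain_db(left_db, right_db, target_db):
    """Gain in dB that moves the more intense ear's level to target_db.

    Assumption: left_db and right_db are the talker's measured levels
    at each ear after HRTF filtering, expressed in dB.
    """
    return target_db - max(left_db, right_db)


def apply_gain(samples, gain_db):
    """Scale a sequence of samples by a gain given in dB (amplitude scale)."""
    scale = 10 ** (gain_db / 20)
    return [s * scale for s in samples]


if __name__ == "__main__":
    # Hypothetical levels: a lateral talker is louder in the near ear.
    gain = better_ear_gain_db(left_db=60.0, right_db=66.0, target_db=65.0)
    print(gain)
    print(len(CONFIGS["hybrid"]))
```

With this scheme, a lateral talker whose near-ear level exceeds the target is attenuated, while a talker that is quieter at both ears is boosted, so every talker ends up equally intense in its louder ear.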
