The DIRHA simulated corpus

Abstract

This paper describes a multi-microphone, multi-language acoustic corpus being developed under the EC project Distant-speech Interaction for Robust Home Applications (DIRHA). The corpus is composed of several sequences obtained by convolving dry acoustic events with more than 9000 impulse responses measured in a real apartment equipped with 40 microphones. The acoustic events include in-domain sentences of different typologies, uttered by native speakers in four different languages, and non-speech events representing typical domestic noises. To increase the realism of the resulting corpus, background noises were recorded in the real home environment and then added to the generated sequences. The purpose of this work is to describe the simulation procedure and the data sets that were created and used to derive the corpus. The corpus contains signals with different characteristics, making it suitable for various multi-microphone signal processing and distant-speech recognition tasks.
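
The simulation step the abstract describes (convolve a dry event with a measured impulse response, then add recorded background noise) can be sketched roughly as follows. This is a minimal illustration, not the DIRHA toolchain: the function names `simulate_channel` and `simulate_array` are hypothetical, and scaling the noise to a target signal-to-noise ratio is an assumption, since the abstract only says that noise was added.

```python
import numpy as np


def simulate_channel(dry, rir, noise, snr_db):
    """Simulate one microphone channel (hypothetical helper).

    dry    : 1-D array, close-talk ("dry") acoustic event
    rir    : 1-D array, impulse response from source position to microphone
    noise  : 1-D array, background noise, at least len(dry) + len(rir) - 1 samples
    snr_db : target signal-to-noise ratio in dB (an assumed mixing rule)
    """
    # Full linear convolution keeps the reverberation tail.
    reverberant = np.convolve(dry, rir)
    if len(noise) < len(reverberant):
        raise ValueError("noise recording is shorter than the simulated signal")
    noise = noise[: len(reverberant)]

    # Scale the noise so that signal power / noise power matches snr_db.
    signal_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise


def simulate_array(dry, rirs, noises, snr_db):
    """Apply simulate_channel once per microphone and stack the results.

    rirs and noises are sequences with one entry per microphone; all
    impulse responses are assumed to have the same length.
    """
    return np.stack(
        [simulate_channel(dry, r, n, snr_db) for r, n in zip(rirs, noises)]
    )


if __name__ == "__main__":
    # Synthetic stand-ins for real recordings, for illustration only.
    rng = np.random.default_rng(0)
    dry = rng.standard_normal(16000)
    rirs = [rng.standard_normal(800) * np.exp(-np.arange(800) / 100.0) for _ in range(2)]
    noises = [rng.standard_normal(16000 + 800 - 1) for _ in range(2)]
    mixed = simulate_array(dry, rirs, noises, snr_db=15.0)
    print(mixed.shape)
```

For impulse responses several seconds long, an FFT-based convolution would be faster than `np.convolve`; the direct form is used here only to keep the sketch dependency-free.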