Conference: International Workshop on Machine Learning for Multimodal Interaction

Speech Activity Detection on Multichannels of Meeting Recordings

Abstract

The Purdue SAD system was originally designed to identify speech regions in multichannel meeting recordings so that transcription effort could be focused on regions containing speech. In the NIST RT-05S evaluation, the system was evaluated in the individual head microphone (ihm) condition of the speech activity detection task. The goal in this condition is to separate the voice of the speaker on each channel from silence and crosstalk. Our system consists of several steps and does not require a training set. The first step begins with a simple silence detection algorithm that uses pitch and energy to roughly separate silence from speech and crosstalk. A global Bayesian Information Criterion (BIC) is then integrated with a Viterbi segmentation algorithm that divides the concatenated stream of local speech and crosstalk into homogeneous portions, which allows an energy-based clustering process to separate local speech from crosstalk. The second step uses the resulting segment information to iteratively train a Gaussian mixture model for each speech activity category and decodes the whole sequence over an ergodic network to refine the segmentation. The final step first uses a cross-correlation analysis to eliminate crosstalk, and then applies a batch of post-processing operations to adapt the segments to the evaluation scenario. In this paper, we describe our system and discuss various issues related to its evaluation.
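
The abstract does not give the form of the global BIC used with the Viterbi segmentation, so the following is only a minimal sketch of the widely used local ΔBIC change-point test, which illustrates how a BIC score can decide whether a stretch of feature frames is homogeneous. It assumes each segment is modelled by a single full-covariance Gaussian; the function name `delta_bic`, the `penalty_weight` parameter, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def delta_bic(frames, split, penalty_weight=1.0):
    """Local Delta-BIC for splitting `frames` (n x d) at row index `split`.

    Positive values favour modelling the two sides with separate Gaussians
    (a change point); negative values favour one homogeneous segment.
    Illustrative only: the paper's global BIC may differ.
    """
    n, d = frames.shape
    left, right = frames[:split], frames[split:]

    def logdet_cov(x):
        # Maximum-likelihood covariance; a small ridge keeps log|cov| finite.
        cov = np.cov(x, rowvar=False, bias=True).reshape(d, d)
        cov = cov + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (n * logdet_cov(frames)
                  - len(left) * logdet_cov(left)
                  - len(right) * logdet_cov(right))

    # Extra free parameters of the second Gaussian: mean plus covariance.
    n_params = d + 0.5 * d * (d + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    quiet = rng.normal(0.0, 1.0, size=(200, 2))   # stand-in for crosstalk frames
    loud = rng.normal(5.0, 1.0, size=(200, 2))    # stand-in for local speech frames
    print("two sources:", delta_bic(np.vstack([quiet, loud]), 200))
    print("one source: ", delta_bic(rng.normal(0.0, 1.0, size=(400, 2)), 200))
```

In a sketch like this, a boundary is kept when the score is positive, and `penalty_weight` trades missed boundaries against over-segmentation; how the actual system tunes or replaces this penalty is not stated in the abstract.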