Journal: IEEE Transactions on Audio, Speech and Language Processing

An overview of automatic speaker diarization systems

Abstract

Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization can be used for helping speech recognition, facilitating the searching and indexing of audio archives, and increasing the richness of automatic transcriptions, making them more readable. In this paper, we provide an overview of the approaches currently used in a key area of audio diarization, namely speaker diarization, and discuss their relative merits and limitations. Performances using the different techniques are compared within the framework of the speaker diarization task in the DARPA EARS Rich Transcription evaluations. We also look at how the techniques are being introduced into real broadcast news systems and their portability to other domains and tasks such as meetings and speaker verification.