International Conference on Speech and Computer
A Free Synthetic Corpus for Speaker Diarization Research

Abstract

A synthetic corpus of dialogs was constructed from the LibriSpeech corpus and is made freely available for diarization research. It includes over 90 h of training data and over 9 h each of development and test data. Both 2-person and 3-person dialogs, with and without overlap, are included. Timing information is provided in several formats and covers not only speaker segmentations but also phoneme segmentations. As such, it is a useful starting point for diarization system development in general, and for early-stage development in particular.