Published in: International Conference on Speech and Computer

A Free Synthetic Corpus for Speaker Diarization Research

Abstract

A synthetic corpus of dialogs was constructed from the LibriSpeech corpus and is made freely available for diarization research. It includes over 90 h of training data, and over 9 h each of development and test data. Both 2-person and 3-person dialogs, with and without overlap, are included. Timing information is provided in several formats and covers not only speaker segmentations but also phoneme segmentations. As such, the corpus is a useful starting point for general diarization system development, particularly at an early stage.
