Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

CN-Celeb: A Challenging Chinese Speaker Recognition Dataset

Abstract

Recently, researchers have set the ambitious goal of conducting speaker recognition in unconstrained conditions, where variations in ambient conditions, channel, and emotion can be arbitrary. However, most publicly available datasets are collected under constrained environments, i.e., with little noise and limited channel variation. These datasets tend to deliver over-optimistic performance and do not meet the needs of research on speaker recognition in unconstrained conditions. In this paper, we present CN-Celeb, a large-scale speaker recognition dataset collected "in the wild". The dataset contains more than 130,000 utterances from 1,000 Chinese celebrities and covers 11 different real-world genres. Experiments conducted with two state-of-the-art speaker recognition approaches (i-vector and x-vector) show that performance on CN-Celeb is far inferior to that obtained on VoxCeleb, a widely used speaker recognition dataset. This result demonstrates that, in real-life conditions, the performance of existing techniques may be much worse than previously thought. Our database is free for researchers and can be downloaded from http://project.cslt.org.