Conference: 9th International Conference on Language Resources and Evaluation

KALAKA-3: a database for the recognition of spoken European languages on YouTube audios

Abstract

This paper describes the main features of KALAKA-3, a speech database specifically designed for the development and evaluation of language recognition systems. The database provides TV broadcast speech for training, and audio data extracted from YouTube videos for tuning and testing. The database was created to support the Albayzin 2012 Language Recognition Evaluation, which featured two language recognition tasks, both dealing with European languages. The first one involved six target languages (Basque, Catalan, English, Galician, Portuguese and Spanish) for which there was plenty of training data, whereas the second one involved four target languages (French, German, Greek and Italian) for which no training data was provided. Two separate sets of YouTube audio files were provided to test the performance of language recognition systems on both tasks. To allow open-set tests, these datasets included speech in 11 additional (Out-Of-Set) European languages. The paper also presents a summary of the results attained in the evaluation, along with the performance of state-of-the-art systems on the four evaluation tracks defined on the database, which demonstrates the extreme difficulty of some of them. As far as we know, this is the first database specifically designed to benchmark spoken language recognition technology on YouTube audios.
