KALAKA-3: a database for the recognition of spoken European languages on YouTube audios

Abstract

This paper describes the main features of KALAKA-3, a speech database specifically designed for the development and evaluation of language recognition systems. The database provides TV broadcast speech for training, and audio data extracted from YouTube videos for tuning and testing. The database was created to support the Albayzin 2012 Language Recognition Evaluation, which featured two language recognition tasks, both dealing with European languages. The first one involved six target languages (Basque, Catalan, English, Galician, Portuguese and Spanish) for which there was plenty of training data, whereas the second one involved four target languages (French, German, Greek and Italian) for which no training data was provided. Two separate sets of YouTube audio files were provided to test the performance of language recognition systems on both tasks. To allow open-set tests, these datasets included speech in 11 additional (Out-Of-Set) European languages. The paper also presents a summary of the results attained in the evaluation, along with the performance of state-of-the-art systems on the four evaluation tracks defined on the database, which demonstrates the extreme difficulty of some of them. As far as we know, this is the first database specifically designed to benchmark spoken language recognition technology on YouTube audios.

Bibliographic details

Source
9th International Conference on Language Resources and Evaluation | 2014 | pp. 3567-3573 | 7 pages
Authors
Luis Javier Rodriguez-Fuentes; Mikel Penagarikano; Amparo Varona; Mireia Diez; German Bordel;
Original format: PDF
Keywords
Spoken Language Recognition; European languages; YouTube audio;

Indexed on 2022-08-26 15:17:23

Similar literature

1. KALAKA-3: a database for the assessment of spoken language recognition technology on YouTube audios [J] . Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, Amparo Varona, Language Resources and Evaluation . 2016, No. 2
2. Audiovisual spoken word recognition as a clinical criterion for sensory aids efficiency in Persian-language children with hearing loss [J] . International journal of pediatric otorhinolaryngology . 2015, No. 12
3. KALAKA-3: a database for the recognition of spoken European languages on YouTube audios [C] . Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, Amparo Varona, 9th International conference on language resources and evaluation . 2014
4. Audio parsing and rapid speaker adaptation in speech recognition for spoken document retrieval. [D] . Zhou, Bowen. 2003
5. How vocabulary size in two languages relates to efficiency in spoken word recognition by young Spanish-English bilinguals [O] . Virginia A. Marchman, Anne Fernald, Nereyda Hurtado
6. THE COMPARATIVE STUDY BETWEEN THE STUDENTS ACHIEVEMENT IN PRODUCING ENGLISH SPOKEN LANGUAGE WITH USING COMMUNICATIVE LANGUAGE TEACHING AND AUDIO LINGUAL METHOD AT SENIOR HIGH SCHOOL STUDENTS OF MA YATAMU PASAWAHAN - CIREBON [O] . IBNU UBAIDILLAH 2012
7. Open-Source Multi-Language Audio Database for Spoken Language Processing Applications. [R] . Zahorian, S. 2012
