Foreign-language conference: International Conference on Text, Speech and Dialogue

Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems


Abstract

In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that the archives of national parliaments are perfect sources of speech and text data for a lightly supervised training scheme that requires no human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian and Bulgarian). A practical evaluation on a broadcast news task yields better results than those obtained with some established speech databases.
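
The abstract does not spell out how the lightly supervised scheme selects training data, so the following is only a generic sketch of one common idea, not the authors' procedure: compare a recognizer's output for each audio segment with the official transcript and keep the segments where the two agree. The field names (`audio_id`, `hypothesis`, `transcript`), the helper names, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def word_agreement(hypothesis: str, transcript: str) -> float:
    """Return a 0.0-1.0 similarity between two word sequences."""
    hyp = hypothesis.lower().split()
    ref = transcript.lower().split()
    if not hyp and not ref:
        return 1.0
    # ratio() is 2 * matched_words / total_words over both sequences.
    return SequenceMatcher(None, hyp, ref, autojunk=False).ratio()


def select_segments(segments, threshold=0.9):
    """Keep (audio_id, transcript) pairs whose hypothesis agrees with the transcript.

    The threshold is an assumed value, not one taken from the paper.
    """
    return [
        (seg["audio_id"], seg["transcript"])
        for seg in segments
        if word_agreement(seg["hypothesis"], seg["transcript"]) >= threshold
    ]


# Toy data standing in for recognizer output paired with archive transcripts.
segments = [
    {"audio_id": "s1",
     "hypothesis": "the session is now open",
     "transcript": "The session is now open"},
    {"audio_id": "s2",
     "hypothesis": "we vote on bill",
     "transcript": "I would like to thank the chair"},
]

selected = select_segments(segments)
```

In this toy input only the first segment would be kept, because its hypothesis matches the transcript word for word while the second shares no words with it. A real pipeline would likely work with time-aligned audio and a stricter alignment than a simple similarity ratio.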
