Speech recognition on Mandarin Call Home: a large-vocabulary, conversational, and telephone speech corpus

Abstract

We describe IBM's most recent efforts in speech recognition on a conversational-speech database, the Mandarin Call Home corpus. While it is similar to the well-known Switchboard corpus, the Call Home task addresses several major challenges in the domain of spoken language systems, including spontaneous dialogue with no pre-specified topics, limited-bandwidth telephone signals, and recognition of languages other than English. In particular, we describe the methodology used on the Mandarin Call Home corpus to address language-specific issues. We also compare our results with those obtained on the English Switchboard corpus. Preliminary experiments show that a 58.7% character error rate can be achieved on the April 95 Mandarin Call Home data set. These results are comparable to those of the state-of-the-art IBM Switchboard system trained on a similar amount of data.
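The abstract reports its result as a character error rate but does not say how it was scored. As a minimal sketch, assuming the conventional definition (character-level edit distance between reference and hypothesis, divided by the number of reference characters), the metric could be computed as follows. The function names and the whitespace handling are illustrative assumptions, not the authors' scoring tool.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance over characters / number of reference characters.

    Assumption: whitespace is ignored, since Chinese transcripts are
    scored per character rather than per space-delimited word.
    """
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One substitution and one deletion against a six-character reference.
    print(character_error_rate("我今天去学校", "我明天去学"))
```

Scoring per character avoids having to agree on a word segmentation for Chinese text, which is one reason a character-level metric is commonly used for Mandarin.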