Compound decomposition in Dutch large vocabulary speech recognition

Abstract

This paper addresses compound splitting for Dutch in the context of broadcast news transcription. Language models were created using original text versions and text versions that were decomposed using a data-driven compound splitting algorithm. Language model performances were compared in terms of out-of-vocabulary rates and word error rates in a real-world broadcast news transcription task. It was concluded that compound splitting does improve ASR performance. Best results were obtained when frequent compounds were not decomposed.
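To make the comparison concrete, the following is a minimal sketch of how an out-of-vocabulary rate might be measured before and after decomposition. It is not the paper's algorithm: the two-part lexicon-lookup splitter, the function names (`split_compound`, `decompose`, `oov_rate`), the `keep_threshold` parameter, and the toy Dutch word lists are all assumptions made for illustration. It also ignores Dutch linking morphemes such as -s- and -en-, which a realistic splitter would need to handle.

```python
from collections import Counter


def split_compound(word, lexicon, min_part_len=3):
    """Return the first two-part split whose parts are both in `lexicon`.

    Illustrative heuristic only; returns [word] when no such split exists.
    """
    for i in range(min_part_len, len(word) - min_part_len + 1):
        head, tail = word[:i], word[i:]
        if head in lexicon and tail in lexicon:
            return [head, tail]
    return [word]


def decompose(tokens, counts, keep_threshold):
    """Split tokens into parts, leaving frequent words (count >= threshold) intact."""
    lexicon = set(counts)
    out = []
    for tok in tokens:
        if counts[tok] >= keep_threshold:
            out.append(tok)  # frequent compound: keep as a single unit
        else:
            out.extend(split_compound(tok, lexicon))
    return out


def oov_rate(test_tokens, vocabulary):
    """Fraction of test tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for tok in test_tokens if tok not in vocabulary)
    return missing / len(test_tokens)


# Toy data (hypothetical): "weerbericht" is frequent in training,
# "nieuwsdienst" never occurs as a whole word.
train = ["nieuws", "dienst", "nieuws", "dienst", "weer", "bericht",
         "weerbericht", "weerbericht", "weerbericht"]
test = ["nieuwsdienst", "weerbericht"]

counts = Counter(train)
vocabulary = set(decompose(train, counts, keep_threshold=3))

baseline_oov = oov_rate(test, vocabulary)               # 0.5
split_test = decompose(test, counts, keep_threshold=3)  # ['nieuws', 'dienst', 'weerbericht']
split_oov = oov_rate(split_test, vocabulary)            # 0.0
```

In this toy setting the unseen compound is covered by its parts once it is split, while the frequent compound stays whole, mirroring the abstract's observation that leaving frequent compounds undecomposed worked best. The numbers here say nothing about the paper's actual results.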
