Conference: 7th International Conference on Natural Language Processing and Knowledge Engineering

Automatic speech recognition for closed-captioning of Filipino news broadcasts

Abstract

This paper discusses the development of a closed-captioning system for Filipino TV news programs. The researchers tested the system for offline captioning and evaluated its performance using word error rate (WER). Carnegie Mellon University's open-source speech recognition system, Sphinx-III, was used as the primary training and recognition engine. A Filipino News Corpus was built from speech and text data obtained from Filipino news videos. Training and testing sets were generated from this corpus, and different Sphinx training and decoding parameters were evaluated on them. Based on the WER computation, the highest average recognition accuracy achieved on the test set was 57.36%, obtained with flat-start context-dependent models and a language model with absolute discounting applied. This project is a first step towards establishing a baseline accuracy for future development of the system.
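For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The Python sketch below illustrates that standard definition only; it is not the scoring tool used in the paper. The function name `word_error_rate` and the sample sentences are made up for illustration, and reading accuracy as 1 minus WER is an assumption, since the abstract does not state how accuracy was derived.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair, not taken from the Filipino News Corpus.
reference = "magandang umaga po sa inyong lahat"
hypothesis = "magandang umaga sa inyo lahat"
wer = word_error_rate(reference, hypothesis)
accuracy = 1.0 - wer  # assumed convention: accuracy = 1 - WER
print(f"WER = {wer:.2%}, accuracy = {accuracy:.2%}")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference, in which case the assumed accuracy would be negative.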
