Conference: Telecommunications and Signal Processing (TSP), 2012 35th International Conference on

Post-processing of the recognized speech for web presentation of large audio archive


Abstract

This paper deals with the post-processing phase of the automatic transcription of spoken documents stored in the large Czech Radio audio archive, which contains hundreds of thousands of recordings. The ultimate goal of the project is to transcribe these recordings and make their content publicly accessible. We focus on methods and algorithms for unsupervised post-processing of automatically recognized recordings, adapted to the needs of the archive's web presentation. So far, the post-processing has been used on about 60,000 audio documents. We present the overall structure of the system as well as its core modules: the speech recognition engine, the speaker diarization module, and the final text processing. Special attention is paid to punctuation: its accuracy is evaluated and compared with human usage. In the final part of the paper, we propose further improvements and ideas for future research.
