Conference: Telecommunications and Signal Processing (TSP), 2012 35th International Conference on

Automatic transcription and speech recognition of Romanian corpus RO-GRID


Abstract

The results reported in this paper assess the ability of a Hidden Markov Model (HMM) based method to generate accurate and reliable automatic phone-level transcriptions for a small-vocabulary speech corpus such as RO-GRID. The system requires only an orthographic transcription of the target corpus and can be bootstrapped from models trained on only a small amount of data from the transcribed corpus. For this purpose, an automatic time-aligned phone transcription toolbox was developed, tested on the Romanian corpus, and validated on an English corpus. Transcription quality is judged by evaluating the statistical parameters of the error between the automatic and manual transcriptions. Transcriptions generated by the most reliable system deviate from the average manual transcription by 20 ms on average. The system can also convert the generated transcriptions from HTK format into PRAAT format for further manipulation of the speech signal.
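
The abstract does not show the toolbox itself, so the following is only a minimal sketch of the two operations it mentions: converting an HTK label file into a Praat TextGrid, and computing statistics of the boundary error between an automatic and a manual transcription. It assumes a plain HTK label file with one `start end label` line per phone and times in 100 ns units, a single-tier TextGrid in Praat's long text format, and identical phone sequences in both transcriptions. All function names (`parse_htk_labels`, `to_textgrid`, `boundary_errors_ms`, `error_stats`) are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev

HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are expressed in units of 100 ns


def parse_htk_labels(text):
    """Parse HTK label lines ("start end label [score]") into (start_s, end_s, label)."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        start, end, label = int(fields[0]), int(fields[1]), fields[2]
        segments.append((start / HTK_UNITS_PER_SECOND, end / HTK_UNITS_PER_SECOND, label))
    return segments


def to_textgrid(segments, tier_name="phones"):
    """Render segments as a single-tier Praat TextGrid (long text format)."""
    xmin, xmax = segments[0][0], segments[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        f'xmin = {xmin}',
        f'xmax = {xmax}',
        'tiers? <exists>',
        'size = 1',
        'item []:',
        '    item [1]:',
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        f'        xmin = {xmin}',
        f'        xmax = {xmax}',
        f'        intervals: size = {len(segments)}',
    ]
    for i, (start, end, label) in enumerate(segments, start=1):
        escaped = label.replace('"', '""')  # Praat escapes a quote by doubling it
        lines += [
            f'        intervals [{i}]:',
            f'            xmin = {start}',
            f'            xmax = {end}',
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


def boundary_errors_ms(auto, manual):
    """Signed end-boundary differences (automatic minus manual), in milliseconds.

    Assumption: both transcriptions contain the same phone sequence, so
    segments can be paired by position.
    """
    if [s[2] for s in auto] != [s[2] for s in manual]:
        raise ValueError("label sequences differ; cannot pair boundaries")
    return [(a[1] - m[1]) * 1000.0 for a, m in zip(auto, manual)]


def error_stats(errors_ms):
    """Return (mean, sample standard deviation) of the errors; needs at least two values."""
    return mean(errors_ms), stdev(errors_ms)
```

Under these assumptions, `to_textgrid(parse_htk_labels(open("utt.lab").read()))` would yield text that can be saved with a `.TextGrid` extension and opened in Praat; `utt.lab` is a placeholder file name. Pairing segments by position is the simplest choice and only holds when the alignment is forced against a known phone sequence, as the abstract's use of orthographic transcriptions suggests.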
