Signal Processing and Communications Applications Conference

Developing an automatic transcription and retrieval system for spoken lectures in Turkish


Abstract

With the growth of online video lectures, speech and language processing technologies have become increasingly important for education. This paper presents an automatic transcription and retrieval system for spoken lectures in Turkish. The system has two main steps: transcribing Turkish video lectures automatically with a large vocabulary continuous speech recognition (LVCSR) system, and locating keywords in the lattices produced by the LVCSR system with a speech retrieval system based on keyword search. To build the system, a state-of-the-art Turkish LVCSR system was first developed using advanced acoustic modeling methods; keywords were then extracted automatically from word sequences in the reference transcriptions of the video lectures; and finally a speech retrieval system was developed to search for these keywords in the lattice output of the LVCSR system. On the test data, the spoken lecture processing system yields a 14.2% word error rate and a maximum term-weighted value of 0.86.
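The abstract reports two metrics without defining them. As a rough illustration only, the sketch below computes word error rate from a word-level edit distance and term-weighted value in the form commonly used in NIST spoken term detection evaluations. The abstract does not say which scoring tool or parameters the authors used, so the function names, the input layout, and the default `beta=999.9` are assumptions made here for illustration, not the paper's method.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def term_weighted_value(counts, speech_seconds, beta=999.9):
    """TWV at one decision threshold.

    counts maps each keyword to (n_true, n_correct, n_false_alarm).
    This layout and the default beta are assumptions, not taken from the paper.
    """
    losses = []
    for n_true, n_correct, n_false_alarm in counts.values():
        if n_true == 0:
            continue  # keywords with no true occurrence are left out of the average
        p_miss = 1.0 - n_correct / n_true
        p_false_alarm = n_false_alarm / (speech_seconds - n_true)
        losses.append(p_miss + beta * p_false_alarm)
    if not losses:
        raise ValueError("no keyword has a true occurrence")
    return 1.0 - sum(losses) / len(losses)


def maximum_twv(counts_by_threshold, speech_seconds, beta=999.9):
    """MTWV: the best TWV over all candidate decision thresholds."""
    return max(term_weighted_value(counts, speech_seconds, beta)
               for counts in counts_by_threshold.values())
```

For example, `word_error_rate("bu ders çok güzel", "bu ders güzel")` counts one deleted word out of four reference words. The large default `beta` makes false alarms far more costly than misses, which is why the decision threshold is usually swept and the best value reported as the maximum.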
