Published in: Language and Technology Conference

Temporal and Lexical Context of Diachronic Text Documents for Automatic Out-Of-Vocabulary Proper Name Retrieval

Abstract

Proper name recognition is a challenging task in information retrieval from large audio/video databases. Proper names are semantically rich and are usually key to understanding the information contained in a document. Our work focuses on increasing the vocabulary coverage of a speech transcription system by automatically retrieving proper names from contemporary diachronic text documents. We propose methods that dynamically augment the vocabulary of the automatic speech recognition (ASR) system using lexical and temporal features of diachronic documents. We also study different metrics for proper name selection in order to limit the size of the vocabulary augmentation and therefore its impact on ASR performance. Recognition results show a significant reduction in the proper name error rate when the augmented vocabulary is used.
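The general idea of temporally constrained vocabulary augmentation can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not name the selection metrics, so plain document frequency is used here as a stand-in. The function name `select_oov_proper_names`, its parameters, and the toy data are all hypothetical, and proper names are assumed to have been extracted from each document beforehand.

```python
from collections import Counter
from datetime import date, timedelta


def select_oov_proper_names(docs, vocabulary, target_date, window_days=1, top_n=2):
    """Pick out-of-vocabulary proper names from dated text documents.

    docs: iterable of (date, list_of_proper_names) pairs (assumed pre-extracted).
    vocabulary: set of words already known to the ASR system.
    target_date: date of the audio document being transcribed.
    """
    window = timedelta(days=window_days)
    counts = Counter()
    for doc_date, names in docs:
        # Temporal filter: keep only documents dated close to the audio.
        if abs(doc_date - target_date) > window:
            continue
        # Count each name once per document (document frequency),
        # skipping names the recognizer already knows.
        for name in set(names):
            if name not in vocabulary:
                counts[name] += 1
    # Rank by document frequency; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Cap the list so the vocabulary grows by at most top_n entries.
    return [name for name, _ in ranked[:top_n]]


# Toy data, invented purely for illustration.
docs = [
    (date(2014, 1, 10), ["Alvarez", "Benoit"]),
    (date(2014, 1, 11), ["Alvarez", "Chen"]),
    (date(2014, 3, 1), ["Dupont"]),  # outside the temporal window
]
vocabulary = {"Benoit"}

print(select_oov_proper_names(docs, vocabulary, date(2014, 1, 10)))
# → ['Alvarez', 'Chen']
```

The `top_n` cap reflects the trade-off stated in the abstract: each added name improves coverage but enlarges the vocabulary, so the selection metric decides which candidates are worth adding.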