Temporal and Lexical Context of Diachronic Text Documents for Automatic Out-Of-Vocabulary Proper Name Retrieval

Abstract

Proper name recognition is a challenging task in information retrieval from large audio/video databases. Proper names are semantically rich and are usually key to understanding the information contained in a document. Our work focuses on increasing the vocabulary coverage of a speech transcription system by automatically retrieving proper names from contemporary diachronic text documents. We proposed methods that dynamically augment the automatic speech recognition (ASR) system vocabulary using lexical and temporal features in diachronic documents. We also studied different metrics for proper name selection in order to limit the vocabulary augmentation and therefore its impact on ASR performance. Recognition results show a significant reduction of the proper name error rate when using an augmented vocabulary.

Bibliographic details

Source
Language and Technology Conference, 2016, pp. 41-54 (14 pages)
Authors
Irina Illina; Dominique Fohr; Georges Linares; Imane Nkairi
Original format
PDF
Keywords
Speech recognition; Out-of-vocabulary words; Proper names; Vocabulary augmentation

