Automatically finding semantically consistent n-grams to add new words in LVCSR systems


Abstract

This paper presents a new method for automatically adding n-grams containing out-of-vocabulary (OOV) words to a baseline language model (LM), where the added n-grams are meant to be grammatically correct and consistent with the meaning of the OOV words. The method first determines the word sequences, i.e., n-grams, in which the usage of a given OOV word is the most semantically consistent. The conditional probabilities of these n-grams must then be computed. To do this, semantic relations between words are used to assimilate each OOV word to several equivalent in-vocabulary words. Based on these in-vocabulary words, n-grams from the baseline LM are reused to find the word sequences to be added and to compute their probabilities. After augmenting the vocabulary and running a recognition process, experiments show that the method yields word error rate (WER) improvements comparable to those obtained with a state-of-the-art open-vocabulary LM.
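The n-gram reuse step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the equivalent in-vocabulary words are weighted or how the probabilities are combined, so the function name `add_oov_ngrams`, the similarity weights, and the weighted-average rule below are all assumptions made for illustration only.

```python
from collections import defaultdict


def add_oov_ngrams(baseline, equivalents):
    """Build candidate n-grams for OOV words by reusing baseline n-grams.

    baseline:    dict mapping an n-gram (tuple of words) to its
                 conditional probability in the baseline LM.
    equivalents: dict mapping each OOV word to a dict of
                 {in-vocabulary word: similarity weight}.

    Assumption (not from the paper): the probability of a new n-gram is
    the weight-normalized average of the probabilities of the baseline
    n-grams it was derived from. No renormalization or back-off weight
    update is performed here.
    """
    new_ngrams = defaultdict(float)
    for oov, equiv in equivalents.items():
        total = sum(equiv.values())
        for ngram, prob in baseline.items():
            for pos, word in enumerate(ngram):
                if word in equiv:
                    # Substitute the OOV word for its in-vocabulary equivalent.
                    candidate = ngram[:pos] + (oov,) + ngram[pos + 1:]
                    new_ngrams[candidate] += (equiv[word] / total) * prob
    return dict(new_ngrams)


# Toy data, invented for this example.
baseline = {
    ("visit", "paris"): 0.02,
    ("visit", "london"): 0.04,
    ("in", "paris"): 0.10,
}
equivalents = {"reykjavik": {"paris": 0.5, "london": 0.5}}

added = add_oov_ngrams(baseline, equivalents)
for ngram, prob in sorted(added.items()):
    print(" ".join(ngram), round(prob, 4))
```

In a real system the added n-grams would still have to be merged into the baseline LM with properly renormalized probabilities and back-off weights, which this sketch leaves out.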


Bibliographic details

Source
2011 IEEE International Conference on Acoustics, Speech and Signal Processing | 2011 | pp. 4676-4679 | 4 pages
Authors
Lecorve Gwenole; Gravier Guillaume; Sebillot Pascale
Original format: PDF
Chinese Library Classification: Communication theory
Keywords
Automatic speech recognition; language modeling; natural language processing; vocabulary adaptation


Similar documents


1. Pseudo-Conventional N-Gram Representation of the Discriminative N-Gram Model for LVCSR [J]. Zhou Z., Meng H. IEEE Journal of Selected Topics in Signal Processing. 2010, no. 6

2. N-gram Approximation of Latent Words Language Models for Domain Robust Automatic Speech Recognition [J]. Ryo Masumura, Taichi Asami, Takanobu Oba. IEICE Transactions on Information and Systems. 2016, no. 10

3. Dealing with Out-of-Vocabulary Words and Filled Pauses in Word N-gram Based Speech Recognition System [J]. Atsuhiko Kai, Yoshifumi Hirose, Seiichi Nakagawa. 情報処理学会論文誌. 1999, no. 4

4. Automatically Finding Semantically Consistent N-grams to Add New Words in LVCSR Systems [C]. Gwenole Lecorve, Guillaume Gravier, Pascale Sebillot. IEEE International Conference on Acoustics, Speech and Signal Processing. 2011

5. Automatic acquisition of lexical semantic knowledge from large corpora: The identification of semantically related words, markedness, polarity, and antonymy [D]. Hatzivassiloglou, Vasileios. 1998

6. Unconscious semantic processing of polysemous words is not automatic [O]. Benjamin Rohaut, F.-Xavier Alario, Jacqueline Meadow. 2016

7. Automatically Finding Semantically Consistent N-grams to Add New Words in LVCSR Systems [O]. Lecorvé, Gwénolé; Gravier, Guillaume; Sébillot, Pascale. 2011
