Out-of-vocabulary word modeling by using sub-word units

Shigehiko Onishi; Hiroaki Kokubo; Hirofumi Yamamoto; Yoshinori Sagisaka

Abstract

A structured language model (STLM) is proposed to cope with out-of-vocabulary (OOV) words that come from multiple word classes. The STLM aims to model each class independently, without interference between classes, and to identify which class an OOV word belongs to. It consists of a conventional word-class N-gram and a set of independently trained, class-specific sub-word N-grams. We built an experimental language model by applying the STLM to two similar proper-noun classes and ran speech recognition experiments. In these experiments, no OOV word of one class was misrecognized as a word of the other class, which indicates that the STLM can integrate multiple different statistical language models without interference.
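The two-level structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class names, the syllable-like sub-word units, the training words, the equal class priors, and the add-one smoothing are all assumptions made for illustration. The sketch scores an OOV word under each class as log P(class | history) + log P(sub-word units | class), where the first term stands in for the word-class N-gram and the second comes from a bigram trained only on that class.

```python
import math
from collections import Counter

START, END = "<w>", "</w>"


class SubwordBigram:
    """Add-one smoothed bigram over sub-word units, trained on one class only."""

    def __init__(self, words, inventory):
        # Shared unit inventory plus the end marker, so classes are comparable.
        self.size = len(inventory) + 1
        self.bigrams = Counter()
        self.context = Counter()
        for units in words:
            seq = [START, *units, END]
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[prev, cur] += 1
                self.context[prev] += 1

    def log_prob(self, units):
        """Log probability of a whole unit sequence, including the end marker."""
        seq = [START, *units, END]
        return sum(
            math.log((self.bigrams[prev, cur] + 1) / (self.context[prev] + self.size))
            for prev, cur in zip(seq, seq[1:])
        )


def classify_oov(units, class_log_prior, models):
    """Combine the class-level score with each class-specific sub-word score."""
    scores = {c: class_log_prior[c] + m.log_prob(units) for c, m in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical training data for two proper-noun classes.
surnames = [["ya", "ma", "mo", "to"], ["ko", "ku", "bo"], ["o", "ni", "shi"]]
places = [["kyo", "to"], ["o", "sa", "ka"], ["na", "ra"]]
inventory = {u for w in surnames + places for u in w}

# Each sub-word model sees only its own class, so the two do not interfere.
models = {
    "surname": SubwordBigram(surnames, inventory),
    "place": SubwordBigram(places, inventory),
}
# Stand-in for the word-class N-gram: equal class probabilities in this context.
priors = {"surname": math.log(0.5), "place": math.log(0.5)}

label, scores = classify_oov(["ko", "ni", "shi"], priors, models)
print(label, scores)
```

Because each sub-word model is trained separately, adding or retraining one class leaves the others unchanged; in this sketch, only the class-level prior ties them together.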

Bibliographic details

Source: 電子情報通信学会技術研究報告. 音声. Speech (IEICE Technical Report, Speech), 2001, No. 31, 7 pages
Authors: Shigehiko Onishi; Hiroaki Kokubo; Hirofumi Yamamoto; Yoshinori Sagisaka
Original format: PDF
Text language: Japanese
Chinese Library Classification: Telegraphy, facsimile
Keywords: LVCSR out-of-vocabulary words; Language model; Structured; Hierarchical
