Conference: International Symposium on Chinese Spoken Language Processing

Approaches to Improving Recognition of Underrepresented Named Entities in Hybrid ASR Systems


Abstract

In this paper, we present a series of complementary approaches to improve the recognition of underrepresented named entities (NEs) in hybrid ASR systems without compromising overall word error rate performance. The underrepresented words correspond to rare or out-of-vocabulary (OOV) words in the training data, and therefore cannot be modeled reliably. We begin with a graphemic lexicon, which removes the need for phonetic models in hybrid ASR. We study it under different settings and demonstrate its effectiveness in dealing with underrepresented NEs. Next, we study the impact of a neural language model (LM) with letter-based features derived to handle infrequent words. After that, we attempt to enrich the representations of underrepresented NEs in a pretrained neural LM by borrowing the embedding representations of well-represented words. This yields a significant performance improvement on underrepresented NE recognition. Finally, we boost the likelihood scores of utterances containing NEs in the word lattices rescored by neural LMs and gain a further performance improvement. The combination of the aforementioned approaches improves NE recognition by up to 42% relative.
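As a rough illustration of two of the ideas named in the abstract, the following Python sketch builds a graphemic lexicon (each word is spelled out as its letters, so no phonetic dictionary is needed) and lets a rare NE borrow an embedding from well-represented words. The abstract gives no implementation details, so the helper names `graphemic_lexicon` and `borrow_embedding`, the letter-filtering rule, the toy vectors, and the choice of averaging the donor embeddings are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List


def graphemic_lexicon(words: List[str]) -> Dict[str, List[str]]:
    """Map each word to its sequence of lowercase letters (hypothetical helper)."""
    lexicon = {}
    for word in words:
        # Assumption: only alphabetic characters become modeling units;
        # apostrophes, hyphens, and digits are dropped.
        units = [ch for ch in word.lower() if ch.isalpha()]
        if units:
            lexicon[word] = units
    return lexicon


def borrow_embedding(
    embeddings: Dict[str, List[float]],
    rare_word: str,
    donor_words: List[str],
) -> None:
    """Overwrite the rare word's vector with the mean of the donors' vectors.

    Averaging is an assumed borrowing rule; the abstract only says that
    representations of well-represented words are borrowed.
    """
    donors = [embeddings[w] for w in donor_words]
    dim = len(donors[0])
    embeddings[rare_word] = [
        sum(vec[i] for vec in donors) / len(donors) for i in range(dim)
    ]


# Toy data: the entries below are made up for the example.
lexicon = graphemic_lexicon(["Singapore", "Jurong", "O'Neil"])
for word, units in lexicon.items():
    print(word, " ".join(units))

embeddings = {
    "london": [1.0, 0.0],
    "paris": [0.0, 1.0],
    "jurong": [0.0, 0.0],  # poorly trained vector for a rare place name
}
borrow_embedding(embeddings, "jurong", ["london", "paris"])
print(embeddings["jurong"])
```

In a real system the lexicon would be written in the format the ASR toolkit expects, and the donor words would be chosen from the same NE class as the rare word; both choices are outside what the abstract specifies.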
