Conference: International Speech Communication Association

A Language-Modeling Approach to Inverse Text Normalization and Data Cleanup for Multimodal Voice Search Applications

Abstract

In this paper we address two related challenges in multimodal local search applications on mobile devices: first, correctly displaying business names, and second, harvesting language model training data from an inconsistently labeled corpus. We investigate the impact of common text normalization and of the quality of the language model training corpus on the accuracy of displayed results. We propose a new language model framework that eliminates the need for explicit inverse text normalization. The same framework can be applied to sift through corrupted language model training data. Our new language model is 25% more accurate while being 25% smaller in size.
