Venue: Annual conference of the International Speech Communication Association

Normalization of Text Messages Using Character- and Phone-based Machine Translation Approaches

Abstract

SMS and Twitter messages contain many abbreviations and non-standard words, which are problematic for text-to-speech (TTS) and other language processing techniques applied to such data. A character-based machine translation (MT) approach was previously used to normalize non-standard words. In this paper, we propose a two-stage translation method that leverages phonetic information: non-standard words are first translated to possible pronunciations, which are then translated to standard words. We further combine this method with the single-step character-based translation module. Our experiments show that the proposed method significantly outperforms previous results in both n-best coverage and 1-best accuracy.
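
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' system: the small lookup tables below are hypothetical stand-ins for trained translation models, the probabilities are invented for illustration, and the linear interpolation used to combine the two paths is an assumption, since the abstract does not say how the modules are combined.

```python
from collections import defaultdict

# Hypothetical toy tables standing in for trained translation models.
# Stage 1: non-standard word -> candidate phone sequences with probabilities.
WORD_TO_PHONES = {
    "2nite": [("T AH N AY T", 0.9), ("T UW N AY T", 0.1)],
}
# Stage 2: phone sequence -> standard words with probabilities.
PHONES_TO_WORD = {
    "T AH N AY T": [("tonight", 1.0)],
    "T UW N AY T": [("two night", 0.6), ("tonight", 0.4)],
}
# Single-step character-based model: non-standard word -> standard words.
CHAR_MODEL = {
    "2nite": [("tonite", 0.5), ("tonight", 0.5)],
}


def two_stage(word):
    """Score standard-word candidates by summing over pronunciations."""
    scores = defaultdict(float)
    for phones, p_phones in WORD_TO_PHONES.get(word, []):
        for standard, p_word in PHONES_TO_WORD.get(phones, []):
            scores[standard] += p_phones * p_word
    return dict(scores)


def normalize(word, weight=0.5, n=3):
    """Return the n best candidates after combining both paths.

    `weight` is an assumed interpolation weight for the phone-based path;
    the remaining mass goes to the character-based path.
    """
    phone_scores = two_stage(word)
    char_scores = dict(CHAR_MODEL.get(word, []))
    candidates = set(phone_scores) | set(char_scores)
    combined = {
        c: weight * phone_scores.get(c, 0.0)
        + (1.0 - weight) * char_scores.get(c, 0.0)
        for c in candidates
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    for candidate, score in normalize("2nite"):
        print(f"{candidate}\t{score:.2f}")
```

The ranked list returned by `normalize` corresponds to an n-best list, and its first entry to the 1-best hypothesis, which are the two quantities the abstract evaluates.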