
Compositional Machine Transliteration



Abstract

Machine transliteration is an important problem in an increasingly multilingual world, as it plays a critical role in many downstream applications, such as machine translation and crosslingual information retrieval (CLIR) systems. In this article, we propose compositional machine transliteration systems, in which multiple transliteration components may be composed either to improve existing transliteration quality or to enable transliteration between languages even when no direct parallel names corpora exist between them. Specifically, we propose two distinct forms of composition: serial and parallel. A serial compositional system chains individual transliteration components, say, X → Y and Y → Z systems, to provide X → Z transliteration functionality. In parallel composition, evidence from multiple transliteration paths from X to Z is aggregated to improve the quality of a direct system. We demonstrate the functionality and performance benefits of the compositional methodology using a state-of-the-art machine transliteration framework for English and a set of Indian languages, namely, Hindi, Marathi, and Kannada. Finally, we underscore the utility and practicality of our compositional approach by showing that a CLIR system integrated with compositional transliteration systems performs consistently on par with, and sometimes better than, one integrated with a direct transliteration system.
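
The two forms of composition can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: a component is modeled as a function returning scored candidates, serial composition multiplies scores along the chain and sums over intermediate candidates, and parallel composition is a weighted sum over paths. The helper names (`from_table`, `serial`, `parallel`) and the toy strings and scores are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A component maps a source name to scored target candidates (assumed interface).
Candidates = List[Tuple[str, float]]
Transliterator = Callable[[str], Candidates]


def from_table(table: Dict[str, Candidates]) -> Transliterator:
    """Wrap a toy lookup table as a transliteration component."""
    return lambda name: table.get(name, [])


def top_k(scores: Dict[str, float], k: int) -> Candidates:
    """Return the k best candidates, highest score first (ties by string)."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def serial(x_to_y: Transliterator, y_to_z: Transliterator, k: int = 5) -> Transliterator:
    """Chain an X -> Y and a Y -> Z component into an X -> Z component."""
    def x_to_z(name: str) -> Candidates:
        scores: Dict[str, float] = defaultdict(float)
        for y, p_y in x_to_y(name):
            for z, p_z in y_to_z(y):
                # Assumed scoring rule: multiply along the chain and
                # sum over all intermediate candidates that reach z.
                scores[z] += p_y * p_z
        return top_k(scores, k)
    return x_to_z


def parallel(paths: List[Transliterator], weights: List[float], k: int = 5) -> Transliterator:
    """Aggregate evidence from several X -> Z paths by a weighted sum."""
    def combined(name: str) -> Candidates:
        scores: Dict[str, float] = defaultdict(float)
        for path, weight in zip(paths, weights):
            for z, p in path(name):
                scores[z] += weight * p
        return top_k(scores, k)
    return combined


# Toy data with placeholder strings and made-up scores (not real results).
x_to_y = from_table({"x": [("y1", 0.5), ("y2", 0.5)]})
y_to_z = from_table({"y1": [("z1", 1.0)], "y2": [("z1", 0.5), ("z2", 0.5)]})
direct = from_table({"x": [("z2", 1.0)]})

chained = serial(x_to_y, y_to_z)                      # X -> Y -> Z
combined = parallel([direct, chained], [0.5, 0.5])    # direct path + serial path

print(chained("x"))
print(combined("x"))
```

In this sketch the serial system needs no X → Z training data at all, while the parallel system lets candidates reached through the intermediate language reorder or supplement the direct system's output. The equal weights are arbitrary; in practice they would be tuned on held-out data.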
