Conference: International University Communication Symposium

Translation table compression under End-Tagged Dense Code

Abstract

In recent years, the quality of Phrase-Based Statistical Machine Translation has improved dramatically, partly because of the significant growth in available parallel corpora. In terms of space, however, this advantage becomes a disadvantage: a larger parallel corpus implies an exponential increase in the size of the translation tables. Existing solutions reduce the size of the translation tables by limiting the length of the sentences incorporated into them. This saves space, but at the risk of degrading the translation of long sentences. In this paper, we propose compressing phrase-based translation tables using End-Tagged Dense Code to encode the phrases in the source and target languages. This technique reduces the size of the translation tables and therefore makes it possible to add longer sentences.
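To make the encoding step concrete, the following is a minimal sketch of word-based End-Tagged Dense Code as it is generally described: words are ranked by decreasing frequency, and each rank is mapped to a byte sequence in which only the last byte has its high bit set. How the paper actually lays out the translation table (scores, alignments, separate source and target vocabularies) is not stated in the abstract, so the helper names `build_vocab`, `encode_phrase` and `decode_phrase`, the whitespace tokenization, and the toy phrases are all assumptions made for illustration.

```python
from collections import Counter

def etdc_encode_rank(rank):
    """ETDC codeword for a 0-based frequency rank.

    Bytes 0..127 are continuation bytes; the final byte is 128..255,
    so the set high bit marks the end of each codeword.
    """
    out = [128 + rank % 128]          # last byte carries the end tag
    rank //= 128
    while rank > 0:
        rank -= 1
        out.append(rank % 128)        # earlier bytes, high bit clear
        rank //= 128
    return bytes(reversed(out))

def etdc_decode_stream(data):
    """Decode a concatenation of ETDC codewords back into ranks."""
    ranks, acc = [], 0
    for b in data:
        if b >= 128:                  # end-tagged byte closes a codeword
            ranks.append(acc + (b - 128))
            acc = 0
        else:
            acc = (acc + b + 1) * 128
    return ranks

def build_vocab(phrases):
    """Hypothetical helper: rank words by decreasing frequency (ties alphabetical)."""
    counts = Counter(w for p in phrases for w in p.split())
    words = sorted(counts, key=lambda w: (-counts[w], w))
    return words, {w: r for r, w in enumerate(words)}

def encode_phrase(phrase, rank_of):
    """Hypothetical helper: concatenate the codewords of a phrase's words."""
    return b"".join(etdc_encode_rank(rank_of[w]) for w in phrase.split())

def decode_phrase(data, words):
    """Hypothetical helper: inverse of encode_phrase."""
    return " ".join(words[r] for r in etdc_decode_stream(data))

# Toy source-side phrases (invented for this example, not from the paper).
phrases = ["the house", "the green house", "a house"]
words, rank_of = build_vocab(phrases)
encoded = encode_phrase("the green house", rank_of)
restored = decode_phrase(encoded, words)
```

With this scheme the 128 most frequent words take one byte each and the next 128 × 128 take two, which is where the space saving over storing phrases as plain text would come from. Because every codeword ends on a tagged byte, phrase boundaries inside a compressed table can be found without decoding from the start.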
