Joint tokenization, parsing, and translation

Abstract

Natural language processing is all about ambiguities. In machine translation, segmentation and structural ambiguities cause tokenization and parsing mistakes, which can in turn introduce translation errors. A well-known solution is to provide more alternatives by using compact representations such as lattices and forests. In this talk, I will introduce a technique that goes beyond lattices and forests by integrating tokenization, parsing, and translation in one system, so that the three tasks can interact with and benefit each other within a discriminative framework. Experimental results show that this integration significantly improves tokenization and translation performance.
