Journal: IEEE Transactions on Audio, Speech, and Language Processing

Syntax-Based Translation With Bilingually Lexicalized Synchronous Tree Substitution Grammars



Abstract

Syntax-based models can significantly improve translation performance because they model grammar on one or both language sides. However, translation rules such as the non-lexical rule "${\rm VP}\rightarrow({\rm x}_{0}{\rm x}_{1},{\rm VP}:{\rm x}_{1}{\rm PP}:{\rm x}_{0})$" in string-to-tree models do not consider any lexicalized information on the source or target side. The rule is so general that any subtree rooted at VP can substitute for the nonterminal ${\rm VP}:{\rm x}_{1}$. Because rules containing nonterminals are used frequently when generating target-side tree structures, rules of this type risk being severely misused in decoding for lack of lexicalization guidance. In this article, inspired by the lexicalized PCFG, which is widely used in monolingual parsing, we propose to upgrade the syntax-based translation model built on synchronous tree substitution grammars (STSG) with a bilingually lexicalized STSG. Using the string-to-tree translation model as a case study, we present generative and discriminative models that integrate the lexicalized STSG into the translation model. Both small- and large-scale experiments on Chinese-to-English translation demonstrate that the proposed lexicalized STSG provides superior rule selection in decoding and substantially improves translation quality.
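To make the abstract's concern concrete, the following is a minimal sketch of how the non-lexical rule checks only root labels, and how a head-word score could discriminate between candidate subtrees. It is not the paper's generative or discriminative model. The `Subtree` class, the `head` annotation, the `apply_rule` and `lexical_score` functions, and the toy counts in `HEAD_PAIR_COUNTS` are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtree:
    label: str    # root nonterminal, e.g. "VP" or "PP"
    head: str     # lexical head word (hypothetical annotation)
    words: tuple  # target-side words covered by this subtree


def apply_rule(x0: Subtree, x1: Subtree) -> Subtree:
    """Apply VP -> (x0 x1, VP:x1 PP:x0) without any lexical check."""
    # Only the root labels are tested, so ANY VP-rooted subtree fits x1.
    if x1.label != "VP" or x0.label != "PP":
        raise ValueError("root labels do not match the rule")
    # The target side emits the VP (x1) first and the PP (x0) second,
    # reversing the source order x0 x1.
    return Subtree("VP", x1.head, x1.words + x0.words)


# Toy, invented co-occurrence counts of (VP head, PP head) pairs.
HEAD_PAIR_COUNTS = {("held", "with"): 3, ("held", "about"): 1}


def lexical_score(vp_head: str, pp_head: str) -> float:
    """Relative-frequency estimate of P(pp_head | vp_head) from toy counts."""
    total = sum(c for (v, _), c in HEAD_PAIR_COUNTS.items() if v == vp_head)
    if total == 0:
        return 0.0  # unseen VP head: no lexical evidence for this pairing
    return HEAD_PAIR_COUNTS.get((vp_head, pp_head), 0) / total


pp = Subtree("PP", "with", ("with", "Sharon"))
vp = Subtree("VP", "held", ("held", "a", "meeting"))
result = apply_rule(pp, vp)
print(" ".join(result.words), lexical_score(vp.head, pp.head))
```

In this sketch the non-lexical rule accepts every VP/PP pair equally, whereas the head-word score gives a decoder a signal for preferring one rule application over another. A real system would estimate such scores from aligned, parsed training data.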
