Journal: ACM Transactions on Asian Language Information Processing

Preordering using a Target-Language Parser via Cross-Language Syntactic Projection for Statistical Machine Translation



Abstract

When translating between languages with widely different word orders, word reordering can present a major challenge. Although some word reordering methods do not employ source-language syntactic structures, such structures are inherently useful for word reordering. However, high-quality syntactic parsers are not available for many languages. We propose a preordering method using a target-language syntactic parser to process source-language syntactic structures without a source-language syntactic parser. To train our preordering model based on ITG (inversion transduction grammar), we produced syntactic constituent structures for source-language training sentences by (1) parsing target-language training sentences, (2) projecting constituent structures of the target-language sentences to the corresponding source-language sentences, (3) selecting parallel sentences with highly synchronized parallel structures, (4) producing probabilistic models for parsing using the projected partial structures and the Pitman-Yor process, and (5) parsing to produce full binary syntactic structures maximally synchronized with the corresponding target-language syntactic structures, using the constraints of the projected partial structures and the probabilistic models. Our ITG-based preordering model is trained using the produced binary syntactic structures and word alignments. The proposed method facilitates the learning of ITG by producing highly synchronized parallel syntactic structures based on cross-language syntactic projection and sentence selection. The preordering model jointly parses input sentences and identifies their reordered structures. Experiments with Japanese-English and Chinese-English patent translation indicate that our method outperforms existing methods, including string-to-tree syntax-based SMT, a preordering method that does not require a parser, and a preordering method that uses a source-language dependency parser.
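The reordering step described above can be illustrated with a minimal sketch. In an ITG-style preordering model, the input sentence is parsed into a binary tree whose internal nodes are labeled either "straight" (keep child order) or "inverted" (swap the two children); reordering is then a single tree traversal. The class names, the toy tree, and the example sentence below are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of ITG-style preordering: swap children at "inverted"
# nodes of a binary parse tree to move source words toward target order.
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    word: str


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    inverted: bool  # True -> emit the right subtree before the left one


Tree = Union[Leaf, Node]


def reorder(tree: Tree) -> list[str]:
    """Return the sentence's words in preordered (target-like) order."""
    if isinstance(tree, Leaf):
        return [tree.word]
    first, second = (tree.right, tree.left) if tree.inverted else (tree.left, tree.right)
    return reorder(first) + reorder(second)


# Toy SOV clause ("he sushi ate") reordered toward SVO:
# the verb-object node is inverted, the subject node is straight.
tree = Node(Leaf("he"), Node(Leaf("sushi"), Leaf("ate"), inverted=True), inverted=False)
print(" ".join(reorder(tree)))  # -> he ate sushi
```

In the paper's setting, the straight/inverted labels come from the trained ITG preordering model, which jointly parses the input and chooses node orientations; the sketch only shows the reordering a fixed labeled tree induces.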

Bibliographic Information

  • Source
  • Author Affiliations

    National Institute of Information and Communications Technology, NHK, and Kyoto University, NHK Science & Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo 157-8510, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

  • Indexing Information
  • Original Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Preordering; syntactic projection; constituent structure; inversion transduction grammar;

