Journal: ACM Transactions on Asian Language Information Processing

Preordering using a Target-Language Parser via Cross-Language Syntactic Projection for Statistical Machine Translation



Abstract

When translating between languages with widely different word orders, word reordering can present a major challenge. Although some word reordering methods do not employ source-language syntactic structures, such structures are inherently useful for word reordering. However, high-quality syntactic parsers are not available for many languages. We propose a preordering method using a target-language syntactic parser to process source-language syntactic structures without a source-language syntactic parser. To train our preordering model based on ITG (inversion transduction grammar), we produced syntactic constituent structures for source-language training sentences by (1) parsing target-language training sentences, (2) projecting constituent structures of the target-language sentences to the corresponding source-language sentences, (3) selecting parallel sentences with highly synchronized parallel structures, (4) producing probabilistic models for parsing using the projected partial structures and the Pitman-Yor process, and (5) parsing to produce full binary syntactic structures maximally synchronized with the corresponding target-language syntactic structures, using the constraints of the projected partial structures and the probabilistic models. Our ITG-based preordering model is trained using the produced binary syntactic structures and word alignments. The proposed method facilitates the learning of ITG by producing highly synchronized parallel syntactic structures based on cross-language syntactic projection and sentence selection. The preordering model jointly parses input sentences and identifies their reordered structures. Experiments with Japanese-English and Chinese-English patent translation indicate that our method outperforms existing methods, including string-to-tree syntax-based SMT, a preordering method that does not require a parser, and a preordering method that uses a source-language dependency parser.
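The reordering step described above can be illustrated with a minimal sketch. In an ITG-style preordering model, the input sentence is parsed into a binary tree whose internal nodes are labeled either "straight" (keep child order) or "inverted" (swap the two children); reordering is then a single tree traversal. The class names, the toy tree, and the example sentence below are illustrative assumptions, not the paper's actual implementation:

```python
# Hypothetical sketch of ITG-style preordering: swap children at "inverted"
# nodes of a binary parse tree to move source words toward target order.
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    word: str


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    inverted: bool  # True -> emit the right subtree before the left one


Tree = Union[Leaf, Node]


def reorder(tree: Tree) -> list[str]:
    """Return the sentence's words in preordered (target-like) order."""
    if isinstance(tree, Leaf):
        return [tree.word]
    first, second = (tree.right, tree.left) if tree.inverted else (tree.left, tree.right)
    return reorder(first) + reorder(second)


# Toy SOV clause ("he sushi ate") reordered toward SVO:
# the verb-object node is inverted, the subject node is straight.
tree = Node(Leaf("he"), Node(Leaf("sushi"), Leaf("ate"), inverted=True), inverted=False)
print(" ".join(reorder(tree)))  # -> he ate sushi
```

In the paper's setting, the straight/inverted labels come from the trained ITG preordering model, which jointly parses the input and chooses node orientations; the sketch only shows the reordering a fixed labeled tree induces.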

Bibliographic Information

  • Source
  • Author Affiliations

    National Institute of Information and Communications Technology, NHK, and Kyoto University, NHK Science & Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo 157-8510, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

  • Indexing Information
  • Original Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Preordering; syntactic projection; constituent structure; inversion transduction grammar;

