We introduce a method for learning to reorder source sentences. In our approach, sentences are transformed into new word sequences so as to reduce non-local reorderings in phrase translation. The method automatically extracts instances of structural divergence from sentence pairs and automatically learns lexicalized grammatical rules that probabilistically encode bilingual word-order relations. At run-time, source sentences are reordered by applying these rules before they are passed to a phrase-based machine translation system. Experiments show that our method cleanly captures systematic similarities and differences between the languages' grammars, resulting in substantial improvement over state-of-the-art phrase-based translation systems.
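To make the run-time step concrete, the following is a minimal sketch of applying probabilistic reordering rules to a tagged source sentence before translation. It is an illustration under stated assumptions, not the rule formalism of the method itself: the rule table `RULES`, the part-of-speech patterns, the probabilities, the `threshold` parameter, and the greedy left-to-right matching are all hypothetical, and the rules here are keyed on tags alone rather than lexicalized.

```python
from typing import Dict, List, Tuple

# Hypothetical rule table (not taken from the paper): each key is a sequence
# of part-of-speech tags on the source side; each value is a permutation of
# the matched positions together with an assumed rule probability.
Rule = Tuple[Tuple[int, ...], float]
RULES: Dict[Tuple[str, ...], Rule] = {
    ("ADJ", "NOUN"): ((1, 0), 0.9),
    ("AUX", "ADV", "VERB"): ((1, 0, 2), 0.4),
}


def reorder(
    tokens: List[Tuple[str, str]],
    rules: Dict[Tuple[str, ...], Rule] = RULES,
    threshold: float = 0.5,
) -> List[str]:
    """Greedily apply reordering rules to a list of (word, tag) pairs.

    Scans left to right, tries the longest tag pattern first, and applies a
    rule only if its probability is at least `threshold`.
    """
    max_len = max((len(pattern) for pattern in rules), default=0)
    out: List[str] = []
    i = 0
    while i < len(tokens):
        applied = False
        # Try patterns of length max_len down to 2 starting at position i.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            pattern = tuple(tag for _, tag in tokens[i:i + n])
            rule = rules.get(pattern)
            if rule is not None and rule[1] >= threshold:
                permutation, _prob = rule
                out.extend(tokens[i + j][0] for j in permutation)
                i += n
                applied = True
                break
        if not applied:
            # No rule fired here: copy the word through unchanged.
            out.append(tokens[i][0])
            i += 1
    return out


if __name__ == "__main__":
    sentence = [("the", "DET"), ("red", "ADJ"), ("house", "NOUN")]
    print(" ".join(reorder(sentence)))
```

The reordered word sequence would then be given to the phrase-based system in place of the original sentence. Thresholding on the rule probability is one simple design choice; an alternative would be to pass several weighted reorderings to the decoder and let it choose among them.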