We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing, with potential application to a variety of parallel corpus analysis problems. The formalism combines three tactics against the constraints that render finite-state transducers less useful: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ordering flexibility, and its probabilistic formulation admits an efficient maximum-likelihood bilingual parsing algorithm. A convenient normal form is shown to exist, and we discuss a number of examples of how stochastic inversion transduction grammars bring bilingual constraints to bear upon problematic corpus analysis tasks.