In this work, we examine the quality of several statistical machine translation systems constructed on a small amount of parallel Serbian-English text. The main bilingual parallel corpus consists of about 3k sentences and 20k running words from an unrestricted domain. The translation systems are built on the full corpus as well as on a reduced corpus containing only 200 parallel sentences. A small set of about 350 short phrases from the web is used as additional bilingual knowledge. In addition, we investigate the use of monolingual morpho-syntactic knowledge, i.e. base forms and POS tags.