Methods for using manual phrase alignment data to generate translation models for statistical machine translation

Abstract

The present invention adopts the fundamental architecture of a statistical machine translation system, which uses statistical models learned from training data and does not require the expert knowledge that rule-based machine translation systems depend on. From the parallel training data, a certain number of sentence pairs are selected for manual alignment. These sentences are aligned at the phrase level rather than at the word level. The optimal amount of data to align manually may vary with the size of the training data. The alignment is done with an alignment tool whose graphical user interface is convenient and intuitive for users. The manually aligned data are then used to improve the automatic word alignment component. Model combination methods are also introduced to improve the accuracy and coverage of the statistical models used for statistical machine translation.
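
The abstract does not say how the models are combined, so the following is only an illustrative sketch under stated assumptions: a phrase table is estimated by relative frequency from manually aligned phrase pairs, then linearly interpolated with a phrase table from automatic alignment. The function names, the interpolation weight of 0.7, and the example phrase pairs are all hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict


def relative_frequency_table(phrase_pairs):
    """Estimate p(target | source) by relative frequency from (source, target) pairs."""
    pair_counts = Counter(phrase_pairs)
    source_counts = Counter(src for src, _ in phrase_pairs)
    table = defaultdict(dict)
    for (src, tgt), count in pair_counts.items():
        table[src][tgt] = count / source_counts[src]
    return dict(table)


def interpolate(manual, automatic, weight=0.7):
    """Combine two phrase tables (assumed method: linear interpolation).

    `weight` is the hypothetical share given to the manual table. A source
    phrase found in only one table keeps that table's distribution, so the
    combined table covers the union of both.
    """
    combined = {}
    for src in set(manual) | set(automatic):
        if src not in automatic:
            combined[src] = dict(manual[src])
        elif src not in manual:
            combined[src] = dict(automatic[src])
        else:
            m, a = manual[src], automatic[src]
            combined[src] = {
                tgt: weight * m.get(tgt, 0.0) + (1 - weight) * a.get(tgt, 0.0)
                for tgt in set(m) | set(a)
            }
    return combined


# Hypothetical manually aligned phrase pairs (illustrative data only).
manual_pairs = [
    ("thank you", "gracias"),
    ("thank you", "gracias"),
    ("good morning", "buenos dias"),
]
manual_table = relative_frequency_table(manual_pairs)

# Hypothetical phrase table derived from automatic word alignment.
automatic_table = {
    "thank you": {"gracias": 0.6, "te agradezco": 0.4},
    "see you": {"hasta luego": 1.0},
}

combined = interpolate(manual_table, automatic_table, weight=0.7)
```

In this sketch the manual table sharpens the estimate for phrases both tables contain, while phrases seen in only one source are still covered, which mirrors the accuracy and coverage goals the abstract mentions.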

Bibliographic data

Publication number: US8229728B2
Publication date: 2012-07-24
Original format: PDF
Applicant/patentee: JUN HUANG; YOOKYUNG KIM; DEMITRIOS MASTER; FARZAD EHSANI
Application number: US20080969518
Inventors: JUN HUANG; YOOKYUNG KIM; DEMITRIOS MASTER; FARZAD EHSANI
Filing date: 2008-01-04
Classification: G06F17/20; G06F17/21; G06F17/28
Country: US
Added to database: 2022-08-21 17:29:00
