International Conference on Advances in Natural Language Processing (NLP, FinTAL2006); August 23-25, 2006; Turku, Finland

Towards the Improvement of Statistical Translation Models Using Linguistic Features

Abstract

Statistical translation models can be inferred from bilingual samples whenever enough training data are available. However, bilingual corpora are usually too scarce to yield reliable statistical models, particularly for highly inflected or agglutinative languages, where many words appear only once. Such events often distort the statistics. To cope with this problem, we have turned to morphological knowledge. Rather than dealing only with running words, we also take advantage of lemmas, producing the translation in two stages. In the first stage, we transform the source sentence into a lemmatized target sentence; in the second stage, we convert the lemmatized target sentence into the target full forms.
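The two-stage idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the English-to-Spanish toy vocabulary, the lookup tables `SOURCE_TO_LEMMA` and `INFLECT`, the function names, and the word-by-word monotone alignment. The abstract describes statistical models estimated from a bilingual corpus, not dictionary lookups, and it does not say what information the second stage uses to choose a full form; here the aligned source word is used as a stand-in.

```python
from typing import Dict, List, Tuple

# Hypothetical stage-1 table: source word -> target lemma.
# Several inflected source words share one target lemma, so counts
# that would be spread over rare full forms are pooled.
SOURCE_TO_LEMMA: Dict[str, str] = {
    "cat": "gato",
    "cats": "gato",
    "sleeps": "dormir",
    "sleep": "dormir",
}

# Hypothetical stage-2 table: (target lemma, aligned source word) -> full form.
INFLECT: Dict[Tuple[str, str], str] = {
    ("gato", "cat"): "gato",
    ("gato", "cats"): "gatos",
    ("dormir", "sleeps"): "duerme",
    ("dormir", "sleep"): "duermen",
}


def to_target_lemmas(source: List[str]) -> List[str]:
    """Stage 1: map each source word to a target lemma (unknown words pass through)."""
    return [SOURCE_TO_LEMMA.get(word, word) for word in source]


def to_full_forms(lemmas: List[str], source: List[str]) -> List[str]:
    """Stage 2: turn each target lemma into a full form, falling back to the lemma."""
    return [INFLECT.get((lemma, word), lemma) for lemma, word in zip(lemmas, source)]


def translate(sentence: str) -> str:
    """Run both stages on a whitespace-tokenized, lowercased sentence."""
    source = sentence.lower().split()
    lemmas = to_target_lemmas(source)
    return " ".join(to_full_forms(lemmas, source))


if __name__ == "__main__":
    print(to_target_lemmas(["cats", "sleep"]))
    print(translate("cats sleep"))
```

In this toy setup, "cat" and "cats" both map to the lemma "gato" in the first stage, which is the kind of pooling that could make the first-stage statistics less sensitive to words seen only once.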
