Venue: Second Conference on Machine Translation

Sense-Aware Statistical Machine Translation using Adaptive Context-Dependent Clustering

Abstract

Statistical machine translation (SMT) systems use local cues from n-gram translation and language models to select the translation of each source word. Such systems do not explicitly perform word sense disambiguation (WSD), although this would enable them to select translations depending on the hypothesized sense of each word. Previous attempts to constrain word translations based on the results of generic WSD systems have suffered from their limited accuracy. We demonstrate that WSD systems can be adapted to help SMT, thanks to three key achievements: (1) we consider a larger context for WSD than SMT can afford to consider; (2) we adapt the number of senses per word to the ones observed in the training data using clustering-based WSD with K-means; and (3) we initialize sense-clustering with definitions or examples extracted from WordNet. Our WSD system is competitive, and in combination with a factored SMT system improves noun and verb translation from English to Chinese, Dutch, French, German, and Spanish.
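
The clustering idea in points (2) and (3) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes bag-of-words context vectors, cosine similarity, and hand-written definitions standing in for WordNet glosses. The function name `cluster_senses` and the rule of dropping senses that attract no training contexts are hypothetical simplifications of "adapting the number of senses to the training data".

```python
from collections import Counter
import math


def bow(text):
    """Bag-of-words vector for a whitespace-tokenized string."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({w: c / len(vectors) for w, c in total.items()})


def cluster_senses(contexts, definitions, iterations=10):
    """K-means over contexts, seeded with one centroid per definition.

    Returns (labels, number_of_senses_kept). Senses with no assigned
    context are dropped (an assumption made for this sketch).
    """
    centroids = [bow(d) for d in definitions]  # assumed stand-in for WordNet glosses
    vectors = [bow(c) for c in contexts]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by cosine similarity.
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: recompute centroids; keep the seed if a cluster is empty.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    used = sorted(set(labels))
    remap = {k: i for i, k in enumerate(used)}
    return [remap[lab] for lab in labels], len(used)


definitions = [
    "financial institution that accepts deposits and lends money",
    "sloping land beside a river",
    "a flight maneuver in which an aircraft tips laterally",
]
contexts = [
    "she deposited money at the bank",
    "the bank approved the loan and the deposits",
    "they walked along the river bank near the water",
    "fishing from the bank of the river",
]
print(cluster_senses(contexts, definitions))
```

In this toy run the three candidate senses of "bank" shrink to two, because no context is assigned to the flight-maneuver definition. In a factored SMT setup, the resulting sense label would be attached to each source word as an extra factor.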