Mexican Conference on Pattern Recognition

A Naive Bayes Approach to Cross-Lingual Word Sense Disambiguation and Lexical Substitution

Abstract

Word Sense Disambiguation (WSD) is considered one of the most important problems in Natural Language Processing [1]. It is claimed that WSD is essential for applications that require language comprehension modules, such as search engines, machine translation systems, automatic answering machines, and Second Life agents. Moreover, given the huge amount of information on the Internet and the fact that this information is continuously growing in different languages, we are encouraged to deal with cross-lingual scenarios where WSD systems are also needed. Lexical Substitution (LS), on the other hand, refers to the process of finding a substitute word for a source word in a given sentence. The LS task has to be approached by first disambiguating the source word; the two tasks (WSD and LS) are therefore related. In this paper, we present a naive approach to the problem of cross-lingual WSD and cross-lingual lexical substitution. We use a bilingual statistical dictionary, computed with Giza++ from the EUROPARL parallel corpus, to calculate the probability of a source word being translated to a target word (which is assumed to be the correct sense of the source word, but in a different language). Two versions of the probabilistic model are tested: unweighted and weighted. The results were compared with those of an international competition, and the approach performed well.
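
The abstract does not give the scoring formula, so the following is only a minimal sketch of how such a model might look, not the authors' implementation. It assumes a translation table `p_trans[word][target]` holding P(target | word), as a Giza++-style dictionary could provide, and combines the evidence from the context words under the naive independence assumption. The weighting scheme (each context word weighted by the inverse of its distance to the ambiguous word), the smoothing constant, the function names, and the toy probabilities are all hypothetical and chosen purely for illustration.

```python
import math

# Assumed floor probability for (word, target) pairs missing from the table.
SMOOTH = 1e-6

# Toy table of P(target | source word). The numbers are invented for this
# example; a real table would be estimated from a parallel corpus.
TOY_TABLE = {
    "bank": {"banco": 0.6, "orilla": 0.4},
    "river": {"orilla": 0.3, "río": 0.7},
    "money": {"banco": 0.2, "dinero": 0.8},
}


def score_candidates(source, context, p_trans, weighted=False):
    """Return a log-score for every candidate translation of `source`.

    `context` is a list of (word, distance) pairs, where distance >= 1 is the
    number of positions between the context word and the ambiguous word.
    """
    scores = {}
    for candidate, prior in p_trans.get(source, {}).items():
        total = math.log(prior)
        for word, distance in context:
            p = p_trans.get(word, {}).get(candidate, SMOOTH)
            # Hypothetical weighting: closer context words count more.
            weight = 1.0 / distance if weighted else 1.0
            total += weight * math.log(p)
        scores[candidate] = total
    return scores


def best_translation(source, context, p_trans, weighted=False):
    """Return the highest-scoring candidate, or None if there is none."""
    scores = score_candidates(source, context, p_trans, weighted)
    if not scores:
        return None
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(best_translation("bank", [("river", 1)], TOY_TABLE))
    print(best_translation("bank", [("money", 1), ("river", 4)],
                           TOY_TABLE, weighted=True))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the `weighted` flag makes the difference between the two variants explicit: the unweighted model treats every context word equally, while the weighted one discounts distant words.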
