Journal: International Journal of Speech Technology

Word sense disambiguation for Arabic text using Wikipedia and Vector Space Model

Abstract

This research introduces a new approach to Arabic word sense disambiguation that uses Wikipedia as the lexical resource. The word context and the senses retrieved from Wikipedia are represented in a Vector Space Model (VSM), and the sense with the highest cosine similarity to the context is selected as the nearest sense for the ambiguous word. Three experiments were conducted to evaluate the proposed approach. Two of them use the first sentence retrieved from Wikipedia for each sense but differ in their VSM representation, while the third uses the first paragraph retrieved for each sense. The experiments show that using the first paragraph is better than using the first sentence, and that TF-IDF is better than abstract frequency in the VSM. The proposed approach was also tested on English words, where it gives better results using the first sentence retrieved from Wikipedia for each sense.
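The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "abstract frequency" refers to raw term counts, uses a smoothed IDF formula chosen here for convenience, and replaces Wikipedia retrieval with two invented English sense descriptions. The names `tokenize`, `vsm_vectors`, `cosine`, and `disambiguate` are hypothetical, and a real Arabic pipeline would also need normalization and stemming.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); no Arabic normalization or stemming.
    return text.lower().split()


def vsm_vectors(docs, use_idf=True):
    """Return one sparse vector (dict: term -> weight) per tokenized document.

    use_idf=False gives raw term counts; use_idf=True gives TF-IDF weights.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    if use_idf:
        # Smoothed IDF (assumption): terms present in every document keep weight 1.
        idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    else:
        idf = {t: 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(context, sense_texts, use_idf=True):
    """Pick the sense whose text is closest to the context by cosine similarity."""
    labels = list(sense_texts)
    docs = [tokenize(context)] + [tokenize(sense_texts[s]) for s in labels]
    vectors = vsm_vectors(docs, use_idf=use_idf)
    scores = {s: cosine(vectors[0], vec) for s, vec in zip(labels, vectors[1:])}
    return max(scores, key=scores.get), scores


# Toy sense descriptions, invented for illustration (not taken from Wikipedia).
senses = {
    "bank (finance)": "a bank is a financial institution that accepts deposits and makes loans",
    "bank (river)": "a bank is the land alongside a river or lake",
}
context = "she opened a deposits account at the bank for loans"
best, scores = disambiguate(context, senses)
best_raw, scores_raw = disambiguate(context, senses, use_idf=False)
```

In this sketch, switching `use_idf` corresponds to changing the VSM representation, and passing a first sentence or a first paragraph as each value in `senses` corresponds to changing how much text is retrieved per sense.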