Conference: 9th International Conference on Language Resources and Evaluation

Word Semantic Similarity for Morphologically Rich Languages



Abstract

In this work, we investigate the role of morphology in the performance of semantic similarity estimation for morphologically rich languages, such as German and Greek. The challenge in processing languages with richer morphology than English lies in reducing estimation error while addressing the semantic distortion introduced by a stemmer or a lemmatiser. For this purpose, we propose a methodology for selective stemming based on a semantic distortion metric. The proposed algorithm is tested on the task of similarity estimation between words using two types of corpus-based similarity metrics: co-occurrence-based and context-based. For morphologically rich languages, performance is boosted by stemming with the context-based metric, unlike English, where the best results are obtained by the co-occurrence-based metric. A key finding is that the reduction in estimation error differs depending on whether a word is used as a feature or as a target word.
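To make the distinction between the two metric families concrete, here is a minimal Python sketch on a toy corpus. It is not the paper's implementation: the abstract does not specify the exact formulas, so the Dice coefficient (for co-occurrence) and cosine similarity over context-window counts (for context) are assumed stand-ins, and the names `cooccurrence_similarity`, `context_similarity`, and the `feature_norm` hook are hypothetical. The hook only marks where a stemmer could be applied to context features without touching the target words.

```python
import math
from collections import Counter

# Toy tokenised corpus (illustrative only, not from the paper).
sentences = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["a", "cat", "chases", "a", "dog"],
]

def cooccurrence_similarity(w1, w2, sentences):
    """Dice coefficient over sentence-level co-occurrence (assumed metric)."""
    n1 = sum(1 for s in sentences if w1 in s)
    n2 = sum(1 for s in sentences if w2 in s)
    n12 = sum(1 for s in sentences if w1 in s and w2 in s)
    return 2.0 * n12 / (n1 + n2) if (n1 + n2) else 0.0

def context_vector(word, sentences, window=1, feature_norm=lambda t: t):
    """Count tokens within `window` positions of each occurrence of `word`.

    `feature_norm` is a hypothetical hook: passing a stemmer here would
    normalise the context features only, leaving the target word as-is.
    """
    counts = Counter()
    for s in sentences:
        for i, tok in enumerate(s):
            if tok == word:
                lo, hi = max(0, i - window), min(len(s), i + window + 1)
                counts.update(feature_norm(s[j]) for j in range(lo, hi) if j != i)
    return counts

def context_similarity(w1, w2, sentences, window=1, feature_norm=lambda t: t):
    """Cosine similarity between the context vectors of two target words."""
    v1 = context_vector(w1, sentences, window, feature_norm)
    v2 = context_vector(w2, sentences, window, feature_norm)
    dot = sum(v1[f] * v2[f] for f in v1)
    norm = math.sqrt(sum(c * c for c in v1.values())) * \
           math.sqrt(sum(c * c for c in v2.values()))
    return dot / norm if norm else 0.0

if __name__ == "__main__":
    print(cooccurrence_similarity("cat", "dog", sentences))
    print(context_similarity("cat", "dog", sentences))
```

Keeping feature normalisation separate from the target words mirrors the abstract's observation that stemming affects the two roles differently; a selective-stemming scheme would decide per word whether to apply it.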
