Journal: Knowledge-Based Systems

EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses



Abstract

Many language applications require word semantics as a core part of their processing pipeline, either for precise meaning inference or for semantic similarity. Multi-sense embeddings (m-se) can be exploited to meet this requirement. An m-se model represents each word by its distinct senses in order to resolve the conflation of meanings that a word takes on in different contexts. Previous works usually approach this task by training a model on a large corpus, and they often ignore the effect and usefulness of the semantic relations offered by lexical resources. However, even with large training data, coverage of all possible word senses is still an issue. In addition, a considerable percentage of contextual semantic knowledge is never learned because a huge number of possible distributional semantic structures are never explored. In this paper, we leverage the rich semantic structures in WordNet, using a graph-theoretic walk technique over word senses, to enhance the quality of multi-sense embeddings. This algorithm composes enriched texts from the original texts. Furthermore, we derive new distributional semantic similarity measures for m-se from prior ones, and we adapt these measures to the word sense disambiguation (wsd) aspect of our experiment. We report evaluation results on 11 benchmark datasets involving wsd and Word Similarity tasks and show that our method for enhancing distributional semantic structures improves embedding quality over the baselines. Despite the small training data, it achieves state-of-the-art performance on some of the datasets. (c) 2021 Elsevier B.V. All rights reserved.
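The abstract does not spell out the walk procedure, so the following is only an illustrative sketch of the general idea of composing enriched text by walking over word senses; it is not the paper's algorithm. The sense graph, the sense labels, and the function names `sense_walk` and `enrich` are all hypothetical, and a uniform random walk is assumed in place of whatever walk strategy the authors use. A real setup would build the graph from WordNet relations rather than from a hand-written dictionary.

```python
import random

# Hypothetical toy sense graph: each node is a word sense and each edge a
# lexical relation (e.g. synonymy or hypernymy). Hand-written for illustration;
# it is NOT extracted from WordNet or from the paper.
SENSE_GRAPH = {
    "bank.n.01": ["slope.n.01", "riverbank.n.01"],
    "bank.n.02": ["financial_institution.n.01", "deposit.n.01"],
    "slope.n.01": ["bank.n.01"],
    "riverbank.n.01": ["bank.n.01", "river.n.01"],
    "river.n.01": ["riverbank.n.01"],
    "financial_institution.n.01": ["bank.n.02", "deposit.n.01"],
    "deposit.n.01": ["bank.n.02", "financial_institution.n.01"],
}


def sense_walk(graph, start, length, rng):
    """Return a uniform random walk of at most `length` senses from `start`."""
    walk = [start]
    current = start
    for _ in range(length - 1):
        neighbours = graph.get(current, [])
        if not neighbours:  # dead end: stop the walk early
            break
        current = rng.choice(neighbours)
        walk.append(current)
    return walk


def enrich(tokens, graph, walk_length=3, seed=0):
    """Expand each sense-tagged token into a short walk over related senses.

    Tokens that are not nodes of the graph are copied through unchanged.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    enriched = []
    for token in tokens:
        if token in graph:
            enriched.extend(sense_walk(graph, token, walk_length, rng))
        else:
            enriched.append(token)
    return enriched


sentence = ["she", "sat", "on", "the", "bank.n.01"]
enriched_sentence = enrich(sentence, SENSE_GRAPH)
print(enriched_sentence)
```

The enriched token sequence could then be fed to any standard embedding trainer, so that senses which rarely co-occur in the raw corpus still appear in shared contexts.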
