Venue: Workshop on Cognitive Aspects of the Lexicon

Contextualized Word Embeddings Encode Aspects of Human-Like Word Sense Knowledge

Abstract

Understanding context-dependent variation in word meanings is a key aspect of human language comprehension supported by the lexicon. Lexicographic resources (e.g., WordNet) capture only some of this context-dependent variation; for example, they often do not encode how closely senses, or discretized word meanings, are related to one another. Our work investigates whether recent advances in NLP, specifically contextualized word embeddings, capture human-like distinctions between English word senses, such as polysemy and homonymy. We collect data from a behavioral, web-based experiment, in which participants provide judgments of the relatedness of multiple WordNet senses of a word in a two-dimensional spatial arrangement task. We find that participants' judgments of the relatedness between senses are correlated with distances between senses in the BERT embedding space. Homonymous senses (e.g., bat as mammal vs. bat as sports equipment) are reliably more distant from one another in the embedding space than polysemous ones (e.g., chicken as animal vs. chicken as meat). Our findings point towards the potential utility of continuous-space representations of sense meanings.
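
The comparison described in the abstract, relating human relatedness judgments to distances between senses in an embedding space, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each sense is represented by the centroid of its token vectors, distance is cosine distance, the agreement measure is Spearman rank correlation, and the sense labels, three-dimensional vectors, and human distances are invented toy values rather than BERT outputs or experimental data.

```python
import math
from itertools import combinations


def centroid(vectors):
    """Average a list of equal-length vectors into one sense centroid."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ranks(values):
    """Assign 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy stand-ins for contextualized token vectors (NOT real BERT output).
# The sense labels are hypothetical, not WordNet identifiers.
token_vectors = {
    "bat_mammal": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "bat_baseball": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]],
    "bat_cricket": [[0.1, 0.9, 0.4], [0.0, 1.0, 0.5]],
}

# Invented distances between senses, standing in for how far apart
# participants might place them in a spatial arrangement task.
human_distance = {
    ("bat_mammal", "bat_baseball"): 0.90,
    ("bat_mammal", "bat_cricket"): 0.95,
    ("bat_baseball", "bat_cricket"): 0.10,
}

centroids = {sense: centroid(vecs) for sense, vecs in token_vectors.items()}
pairs = list(combinations(token_vectors, 2))
model = [cosine_distance(centroids[a], centroids[b]) for a, b in pairs]
human = [human_distance[pair] for pair in pairs]

rho = spearman(model, human)
print(f"Spearman correlation over {len(pairs)} sense pairs: {rho:.3f}")
```

In this toy setup the homonymous pairs (mammal vs. either sports sense) end up farther apart than the two closely related sports senses, which mirrors the qualitative pattern the abstract reports. Real use would replace `token_vectors` with contextualized embeddings extracted from sense-annotated sentences.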
