Journal: Applied Network Science

Graph-based exploration and clustering analysis of semantic spaces


Abstract

The goal of this study is to demonstrate how tools and concepts from network science and graph theory can be used to explore and compare the semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on word2vec representations of words, which are “learnt” from large text corpora (Google News, Amazon reviews), and “human built” word networks derived from two well-known lexical databases: WordNet and Moby Thesaurus. We compare “global” (e.g., degrees, distances, clustering coefficients) and “local” (e.g., most central nodes and community-type dense clusters) characteristics of the considered networks. Our observations suggest that human built networks possess more intuitive global connectivity patterns, whereas local characteristics (in particular, dense clusters) of the machine built networks provide much richer information on the contextual usage and perceived meanings of words. This reveals interesting structural differences between human built and machine built semantic networks. To our knowledge, this is the first study that uses graph theory and network science in the considered context; we therefore also provide examples and discuss potential research directions that may motivate further work on the synthesis of lexicographic and machine learning based tools and lead to new insights in this area.
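
The abstract does not say how the word2vec networks are constructed, so the following is only a minimal sketch under stated assumptions: an edge joins two words when the cosine similarity of their vectors meets a hypothetical threshold, and the vectors themselves are made-up 3-dimensional toy values rather than real embeddings. It illustrates two of the “global” quantities mentioned above, node degree and the local clustering coefficient.

```python
import math
from itertools import combinations

# Toy 3-dimensional vectors (hypothetical values, NOT real word2vec embeddings).
vectors = {
    "cat":    (0.9, 0.1, 0.0),
    "dog":    (0.8, 0.2, 0.0),
    "kitten": (0.9, 0.0, 0.1),
    "car":    (0.0, 0.9, 0.1),
    "truck":  (0.1, 0.9, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_graph(vectors, threshold):
    """Undirected graph as an adjacency dict; the threshold rule is an assumption."""
    graph = {word: set() for word in vectors}
    for u, v in combinations(vectors, 2):
        if cosine(vectors[u], vectors[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def local_clustering(graph, node):
    """Fraction of pairs of neighbours of `node` that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


graph = build_graph(vectors, threshold=0.9)  # 0.9 is an arbitrary illustrative cutoff
degrees = {word: len(nbrs) for word, nbrs in graph.items()}
clustering = {word: local_clustering(graph, word) for word in graph}

for word in graph:
    print(word, degrees[word], clustering[word])
```

With these toy values the animal words form a triangle and the vehicle words form a single edge, which is the kind of dense local cluster the abstract refers to. On real data the vectors would come from a trained embedding model and the threshold would need tuning.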
