Content-aware attributed entity embedding for synonymous named entity discovery

Abstract

Synonymous Named Entity Discovery (SNED) is the task of discovering named entities that refer to the same real-world entity. Discovering synonymous named entities with manually designed features and similarity metrics is difficult because the raw features (e.g. the associated attributes and text content) are so diverse. In this paper, we present Content-Aware Attributed Entity Embedding (CAAEE), an unsupervised SNED model that addresses this issue. By leveraging the associated attributes and text content, our approach learns a projection that maps named entities to a low-dimensional feature space without any manually designed features or supervised information. In the learned feature space, synonymous named entities lie close to each other, so distance in that space reflects the similarity between named entities. We build two heterogeneous networks to jointly model named entities, their associated attributes, and their text content. For each heterogeneous network, we design two objective functions, based on two probability distributions, that aim to preserve the network structure. By jointly optimizing the objective functions, we obtain a low-dimensional representation for each named entity. The similarity between the learned low-dimensional representations is then used to discover synonymous named entities. In experiments, we compare our model with existing SNED models on two real-world named entity datasets. Experimental results show that CAAEE significantly outperforms state-of-the-art methods. © 2018 Elsevier B.V. All rights reserved.
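
The final step described in the abstract, ranking entities by the similarity of their learned representations, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say which similarity measure CAAEE uses, so cosine similarity is swapped in; the function names `cosine` and `synonym_candidates` are hypothetical; and the vectors are hand-made toy values, not embeddings learned by the model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def synonym_candidates(embeddings, query, top_k=3):
    """Rank every entity other than `query` by similarity to `query`."""
    q = embeddings[query]
    scored = [
        (name, cosine(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors chosen by hand; a real system would use the
# low-dimensional representations produced by the embedding model.
toy_embeddings = {
    "IBM": [0.9, 0.1, 0.0],
    "International Business Machines": [0.85, 0.15, 0.05],
    "Apple Inc.": [0.1, 0.9, 0.2],
    "Big Blue": [0.8, 0.2, 0.1],
}

ranked = synonym_candidates(toy_embeddings, "IBM", top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In practice a similarity threshold or a fixed `top_k` would decide which candidates are reported as synonyms; either choice is a tuning decision the abstract does not specify.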
