An Unsupervised Method for Linking Entity Mentions in Chinese Text

Abstract

Entity linking is the process of linking entity mentions in text with the unambiguous entity objects in a knowledge base. The technology is a key step in expanding a knowledge base, and it can improve the information filtering ability of online recommendation systems, search engines, and other practical applications. However, the large number of entities and the diversity and ambiguity of entity names pose major challenges for entity linking research. In addition, the scarcity of Chinese knowledge bases and the complex syntax of Chinese text hinder research on Chinese entity linking. To meet the processing requirements of Chinese text, we propose an unsupervised Chinese entity linking method, named un-CEML. The method uses Baidu encyclopedia as its knowledge base, applies a similarity algorithm to retrieve entries from Baidu encyclopedia, and combines the characteristics of this encyclopedia to obtain candidate entities. This handles abbreviated and wrongly segmented entity mentions while controlling both the size of the candidate set and the probability that it contains the target entity. In the candidate ranking stage, we use the dependencies among the components of a sentence to extract the information most strongly related to an entity mention and take it as the context, which reduces noise when computing similarity with candidate entities. Because nominal mentions are mostly common words that correlate only weakly with the knowledge in the document, we handle them separately. We conduct experiments on real data sets and compare against several standard methods. The experimental results show that our method can resolve the ambiguity of Chinese entity mentions and achieves high linking accuracy.
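
The two stages described in the abstract (candidate generation by name similarity, then candidate ranking by context similarity) can be illustrated with a minimal Python sketch. This is not the un-CEML implementation: the toy knowledge base `KB`, the functions `name_similarity`, `generate_candidates`, `cosine`, and `link`, and the `threshold` value are all hypothetical, a generic character-level ratio from `difflib` stands in for the unspecified similarity algorithm, and the context tokens are assumed to have been selected already (the paper derives them from dependency relations, which this sketch does not attempt). English strings are used only to keep the example readable.

```python
from collections import Counter
from difflib import SequenceMatcher
import math

# Hypothetical toy knowledge base: entry title -> description text.
# The paper uses Baidu encyclopedia; this data is invented for illustration.
KB = {
    "Apple Inc.": "technology company iphone mac computer cupertino",
    "Apple (fruit)": "fruit tree orchard sweet edible",
    "Pineapple": "tropical fruit plant sweet",
}

def name_similarity(a, b):
    """Character-level similarity in [0, 1] (stand-in for the paper's algorithm)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def generate_candidates(mention, kb, threshold=0.5):
    """Stage 1: keep entries whose title is similar enough to the mention."""
    return [title for title in kb if name_similarity(mention, title) >= threshold]

def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * \
           math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def link(mention, context_tokens, kb, threshold=0.5):
    """Stage 2: rank candidates by context similarity and return the best one."""
    candidates = generate_candidates(mention, kb, threshold)
    if not candidates:
        return None  # no entry is close enough to the mention
    return max(candidates, key=lambda t: cosine(context_tokens, kb[t].split()))

if __name__ == "__main__":
    print(link("apple", ["new", "iphone", "company"], KB))
    print(link("apple", ["sweet", "fruit", "orchard"], KB))
```

In this sketch the same mention resolves to different entries depending on its context tokens, which is the disambiguation effect the abstract describes. The threshold trades candidate set size against the chance of missing the target entity.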

Bibliographic details

Source: Asia-Pacific Services Computing Conference, 2016, pp. 183-195 (13 pages)
Conference location: Zhangjiajie (CN)
Authors: Jing Xu; Liang Gan; Bin Zhou; Quanyuan Wu
Affiliation: College of Computer, National University of Defense Technology, Changsha 410073, China
Original format: PDF
Keywords: Entity linking; Baidu encyclopedia; Information extraction; Unsupervised; Chinese text

