Conference: International Conference on Information and Knowledge Technology

ParsEL 1.0: Unsupervised Entity Linking in Persian Social Media Texts

Abstract

The number of social media users has grown exponentially in recent years, and social media data has become one of the largest data repositories in the world. Natural language text makes up a major portion of this data. However, this text contains many entities, which increases its ambiguity. Entity linking aims to find entity mentions and link them to their corresponding entities in an external dataset. Recently, FarsBase was introduced as the first Persian knowledge graph, containing almost 750,000 entities. In this study, we propose ParsEL, the first unsupervised end-to-end entity linking system designed specifically for the Persian language, which uses contextual and graph-based features to rank candidate entities. To evaluate the proposed approach, we publish the first entity linking dataset for the Persian language, which was created by crawling social media text from several popular Telegram channels and contains 67,595 tokens. The results show that ParsEL achieves an F-score of 86.94% on the introduced dataset, which is comparable with one other entity linking system that supports the Persian language.
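
The abstract does not specify how the contextual and graph-based features are computed or combined, so the following is only a minimal sketch of the general idea, not ParsEL's actual method. It assumes a Jaccard word overlap as the contextual feature, a neighbor-overlap ratio as the graph feature, and an equal-weight combination controlled by a hypothetical `alpha` parameter. The `Candidate` class, the toy English entities, and all function names are illustrative assumptions. The `f_score` helper shows the standard harmonic mean of precision and recall over (mention, entity) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical knowledge-graph entity that a mention might refer to."""
    entity_id: str
    description_words: set
    neighbors: set = field(default_factory=set)


def context_score(context_words, candidate):
    """Jaccard overlap between the mention's context and the entity description."""
    union = context_words | candidate.description_words
    if not union:
        return 0.0
    return len(context_words & candidate.description_words) / len(union)


def graph_score(candidate, other_entities):
    """Fraction of the other entities in the text that are graph neighbors."""
    if not other_entities:
        return 0.0
    return len(candidate.neighbors & other_entities) / len(other_entities)


def rank_candidates(context_words, candidates, other_entities, alpha=0.5):
    """Return (score, entity_id) pairs sorted from best to worst.

    alpha is an assumed mixing weight between the two features.
    """
    scored = [
        (
            alpha * context_score(context_words, c)
            + (1 - alpha) * graph_score(c, other_entities),
            c.entity_id,
        )
        for c in candidates
    ]
    return sorted(scored, reverse=True)


def f_score(predicted, gold):
    """Harmonic mean of precision and recall over (mention, entity_id) pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration): the mention "Tehran" in "capital of Iran".
candidates = [
    Candidate("tehran_city", {"capital", "city", "iran"}, {"iran"}),
    Candidate("tehran_province", {"province", "iran"}, {"iran", "karaj"}),
]
ranking = rank_candidates({"capital", "of", "iran"}, candidates, {"iran"})
best_entity = ranking[0][1]
```

In this toy case both candidates share a graph neighbor with the other entity in the text, so the contextual overlap decides the ranking. No labeled training data is involved, which is what makes a scoring scheme of this kind unsupervised.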