Journal: Knowledge-Based Systems

Words are important: A textual content based identity resolution scheme across multiple online social networks


Abstract

Resolving the identity of a person who uses several online social networks can give an interested party a better, more holistic understanding of that person's behavior and personality. The major challenges in developing a reliable and scalable matching scheme for online identities are that the required information is often unavailable, or that the information for the same user is contradictory across networks. In this study, we present an identity matching scheme that uses important features extracted from the content generated by, or shared with, users across their online social networks. Using natural language processing and text mining techniques, we extract and process parts of speech, symbols, emoticons, numbers, and high-frequency words in users' posts, tweets, retweets, and URLs. In experiments on real ground-truth Twitter-Facebook datasets, the method achieved 91.2 percent accuracy in matching a user's identity across that user's profiles. The main contribution of this paper is a novel identity matching method that uses only the publicly available content of online social network users. The method can be used alone for identity matching, or alongside other identity resolution frameworks to improve their accuracy. © 2020 Elsevier B.V. All rights reserved.
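
The abstract names the feature types but not the matching procedure, so the following is only a minimal sketch under stated assumptions, not the authors' method. It counts symbols, emoticons, numbers, URL domains, and high-frequency words in a user's posts, then pairs profiles by cosine similarity. The regular expressions, the `top_k` cutoff, the function names, and the choice of cosine similarity are all hypothetical. Part-of-speech tagging, which the abstract also mentions, is left out to keep the example free of third-party dependencies.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Illustrative patterns only; the paper does not specify its tokenization.
URL_RE = re.compile(r"https?://\S+")
EMOTICON_RE = re.compile(r"[:;=][-^']?[)(DPpOo/\\|]")
SYMBOL_RE = re.compile(r"[#@!?$%&*]")
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")
WORD_RE = re.compile(r"[a-z']+")


def extract_features(posts, top_k=50):
    """Build a sparse count vector from one user's posts (assumed design)."""
    features = Counter()
    words = Counter()
    for post in posts:
        # Record the domain of each URL, then drop URLs from the text.
        for url in URL_RE.findall(post):
            features["url:" + urlparse(url).netloc.lower()] += 1
        text = URL_RE.sub(" ", post)
        # Count emoticons before symbols so ":)" is not split apart.
        for emoticon in EMOTICON_RE.findall(text):
            features["emo:" + emoticon] += 1
        text = EMOTICON_RE.sub(" ", text)
        for symbol in SYMBOL_RE.findall(text):
            features["sym:" + symbol] += 1
        number_count = len(NUMBER_RE.findall(text))
        if number_count:
            features["num"] += number_count
        words.update(WORD_RE.findall(text.lower()))
    # Keep only the most frequent words as features.
    for word, count in words.most_common(top_k):
        features["word:" + word] = count
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(source_posts, candidates):
    """Return (candidate_name, score) for the most similar candidate profile."""
    source = extract_features(source_posts)
    scored = {
        name: cosine(source, extract_features(posts))
        for name, posts in candidates.items()
    }
    name = max(scored, key=scored.get)
    return name, scored[name]
```

With a user's tweets as `source_posts` and a dictionary of candidate Facebook profiles as `candidates`, `best_match` returns the candidate whose content features are closest. A real system would also need a similarity threshold for the case where the user has no account on the other network.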