Journal: Information Sciences: An International Journal

A deep dive into user display names across social networks



Abstract

The display names that an individual uses across Online Social Networks (OSNs) contain abundant redundant information, because most users tend to use one main name, or similar names, across OSNs to make them easier to remember or to build an online reputation. This redundancy is of great benefit to information fusion across OSNs. In this paper, we aim to measure the information redundancy between different display names of the same individual. Using the cross-site linking function of Foursquare, we first develop a distributed crawler to extract the display names that individuals use on Facebook, Twitter and Foursquare, respectively. We construct three display name datasets across the three OSNs and measure the information redundancy in three ways: length similarity, character similarity and letter distribution similarity. We also analyze how the redundant information evolves over time. Finally, we apply the measurement results to user identification across OSNs. We find that (1) more than 45% of users tend to use the same display name across OSNs; (2) the display names of the same individual on different OSNs show high similarity; (3) the information redundancy of display names is time-independent; (4) the AUC values of user identification based only on display names exceed 0.9 on all three datasets. (C) 2018 Elsevier Inc. All rights reserved.
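As an illustration of the three measurements named in the abstract, the following is a minimal Python sketch. The abstract does not give the paper's formulas, so the definitions here are assumptions chosen for simplicity: length similarity as the ratio of the shorter to the longer length, character similarity as the `difflib.SequenceMatcher` ratio, and letter distribution similarity as the cosine similarity of letter-frequency vectors. The function names are hypothetical and not taken from the paper.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def length_similarity(a: str, b: str) -> float:
    """Assumed definition: shorter length divided by longer length."""
    if not a and not b:
        return 1.0  # two empty names have identical length
    return min(len(a), len(b)) / max(len(a), len(b))


def character_similarity(a: str, b: str) -> float:
    """Assumed definition: case-insensitive SequenceMatcher ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def letter_distribution_similarity(a: str, b: str) -> float:
    """Assumed definition: cosine similarity of letter-frequency vectors."""
    ca = Counter(ch for ch in a.lower() if ch.isalpha())
    cb = Counter(ch for ch in b.lower() if ch.isalpha())
    if not ca or not cb:
        return 0.0  # no letters to compare in at least one name
    dot = sum(count * cb[ch] for ch, count in ca.items())
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical pair of display names for the same person on two sites.
    name_a, name_b = "John Smith", "john_smith92"
    print(length_similarity(name_a, name_b))
    print(character_similarity(name_a, name_b))
    print(letter_distribution_similarity(name_a, name_b))
```

Under these assumed definitions, each score lies in [0, 1], and two identical display names score 1 on all three measures (up to floating-point rounding for the cosine score).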
