Conference: 2017 International Conference on Information, Communication, Instrumentation and Control

Social media data sensitivity and privacy scanning: an experimental analysis with Hadoop


Abstract

Social networking has become a daily habit for almost everyone, and most young people and teenagers spend a good deal of their time on social media. Because users are so easy to reach there, marketing companies also use these platforms to publish advertisements. Not every user is legitimate, however: the platforms are sometimes used to abuse or harass people. Sensitive content therefore needs to be identified before it is published on social media. A number of approaches are available for scanning content, but all of them are time-consuming, so they are not applied directly to social networks. This work presents an effort to find a more efficient technique. The proposed technique enhances the traditional fingerprint scan method for sensitive content evaluation. It incorporates NLP (natural language processing) parsers to identify sensitive features, which are taken here to be the nouns in a tweet, because in most cases the identities of people or places are what is used to mislead social network users. In addition, a random index scan replaces linear search to reduce the time consumed by the traditional approaches, since in the worst case it produces results equal to those of linear search. The proposed technique is evaluated on Twitter data using an implementation built on Hadoop, Storm and the Twitter API. After implementation, the technique is compared with the traditional technique in terms of time and space complexity. The experimental results show that, in terms of time required, the proposed technique is three times as efficient as the traditional sensitivity scan.
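The abstract gives no implementation details, so the following is only a minimal sketch of how noun extraction, fingerprinting and a random index scan might fit together. Every name in it is hypothetical: `candidate_nouns` is a crude stand-in for an NLP parser (it simply treats capitalized words as proper nouns), `fingerprint` assumes a SHA-256 hash of the lowercased term, and `random_index_scan` assumes the scan probes each stored fingerprint at most once in shuffled order. It does not attempt to reproduce the reported speed-up.

```python
import hashlib
import random
import re


def fingerprint(term):
    """Hash a normalized term (assumed stand-in for the fingerprint step)."""
    return hashlib.sha256(term.lower().encode("utf-8")).hexdigest()


def candidate_nouns(tweet):
    """Crude stand-in for an NLP parser: capitalized words as proper nouns."""
    return re.findall(r"\b[A-Z][a-z]+\b", tweet)


def linear_scan(store, target):
    """Probe indices 0..n-1 in order; return (found, probes used)."""
    for probes, value in enumerate(store, start=1):
        if value == target:
            return True, probes
    return False, len(store)


def random_index_scan(store, target, rng):
    """Probe each index at most once, in random order.

    A miss costs len(store) probes, the same as linear_scan, and the
    found/not-found answer always matches linear_scan.
    """
    order = list(range(len(store)))
    rng.shuffle(order)
    for probes, idx in enumerate(order, start=1):
        if store[idx] == target:
            return True, probes
    return False, len(store)


def scan_tweet(tweet, store, rng):
    """Return candidate nouns whose fingerprints appear in the store."""
    return [
        term
        for term in candidate_nouns(tweet)
        if random_index_scan(store, fingerprint(term), rng)[0]
    ]


if __name__ == "__main__":
    # Hypothetical store of sensitive-term fingerprints.
    store = [fingerprint(t) for t in ["alice", "springfield", "bob"]]
    rng = random.Random(0)
    print(scan_tweet("Meet Alice near Springfield tomorrow", store, rng))
```

In this sketch a miss always costs one probe per stored fingerprint under either scan, which is one way to read the abstract's remark that the random index scan matches linear search in the worst case.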
