Linking Personally Identifiable Information from the Dark Web to the Surface Web: A Deep Entity Resolution Approach

Abstract

The information privacy of the Internet users has become a major societal concern. The rapid growth of online services increases the risk of unauthorized access to Personally Identifiable Information (PII) of at-risk populations, who are unaware of their PII exposure. To proactively identify online at-risk populations and increase their privacy awareness, it is crucial to conduct a holistic privacy risk assessment across the internet. Current privacy risk assessment studies are limited to a single platform within either the surface web or the dark web. A comprehensive privacy risk assessment requires matching exposed PII on heterogeneous online platforms across the surface web and the dark web. However, due to the incompleteness and inaccuracy of PII records in each platform, linking the exposed PII to users is a non-trivial task. While Entity Resolution (ER) techniques can be used to facilitate this task, they often require ad-hoc, manual rule development and feature engineering. Recently, Deep Learning (DL)-based ER has outperformed manual entity matching rules by automatically extracting prominent features from incomplete or inaccurate records. In this study, we enhance the existing privacy risk assessment with a DL-based ER method, namely Multi-Context Attention (MCA), to comprehensively evaluate individuals' PII exposure across the different online platforms in the dark web and surface web. Evaluation against benchmark ER models indicates the efficacy of MCA. Using MCA on a random sample of data breach victims in the dark web, we are able to identify 4.3% of the victims on the surface web platforms and calculate their privacy risk scores.

Bibliographic details

Source: IEEE International Conference on Data Mining Workshops, 2020, pp. 488-495 (8 pages)
Authors: Fangyu Lin; Yizhi Liu; Mohammadreza Ebrahimi; Zara Ahmad-Post; James Lee Hu; Jingyu Xin; Sagar Samtani; Weifeng Li; Hsinchun Chen
Original format: PDF
Keywords: Privacy; Sociology; Manuals; Risk management; Task analysis; Statistics; Erbium

Similar literature

1. Web of Data and Web of Entities: Identity and Reference in Interlinked Data in the Semantic Web [J]. Paolo Bouquet, Heiko Stoermer, Massimiliano Vignolo. Knowledge Technology & Policy. 2012, No. 1
2. SMAPH: A Piggyback Approach for Entity-Linking in Web Queries [J]. Cornolti Marco, Ferragina Paolo, Ciaramita Massimiliano. ACM Transactions on Information Systems. 2019, No. 1
3. DEEP WEB, DARK WEB, INVISIBLE WEB AND THE POST ISIS WORLD [J]. Ryan Ehney, Jack Shorter. Issues in Information Systems. 2016, No. 4
4. Identifying, Collecting, and Monitoring Personally Identifiable Information: From the Dark Web to the Surface Web [C]. Yizhi Liu, Fang Yu Lin, Zara Ahmad-Post. IEEE International Conference on Intelligence and Security Informatics. 2020
5. Dark Web: Problems Law Enforcement Investigations Face on the Dark Web [D]. Mickles, Jerry. 2017
6. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling [O]. R. Suganya Devi, D. Manjula, R. K. Siddharth. 2015
7. Deep web, dark web, dark net [O]. Masayuki Hatta. 2020