Journal: Frontiers of Computer Science in China

TPII: tracking personally identifiable information via user behaviors in HTTP traffic


Abstract

Application service providers (ASPs) commonly have their mobile applications actively collect non-critical personally identifiable information (PII) from users' devices and send it to the cloud in order to provide precise recommendation services. Meanwhile, Internet service providers (ISPs) and local network providers also have a strong need to collect PII for finer-grained traffic control and security services. However, accurately locating PII in massive volumes of network traffic is challenging, like looking for a needle in a haystack. In this paper, we address this challenge by presenting an efficient and lightweight approach, TPII, which locates and tracks PII at the HTTP layer reconstructed from raw network traffic. The approach collects only three features from HTTP fields to represent user behavior and then builds a tree-based decision model to mine PII efficiently and accurately. Without any prior knowledge, TPII can identify any type of PII from any mobile application, which gives it broad applicability. We evaluate the proposed approach on a real dataset collected from a campus network with more than 13,000 users. The experimental results show that the precision and recall of TPII are 91.72% and 94.51%, respectively, and that a parallel implementation of TPII can mine and label 213 million records within one hour, which comes close to supporting 1 Gbps wire-speed inspection in practice. Our approach offers network service providers a practical way to collect PII for better services.
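To put the reported figures in perspective, the short Python sketch below derives three quantities from them: the F1 score implied by the stated precision and recall, the record throughput per second, and the average traffic volume per record at which 213 million records per hour would correspond to a fully loaded 1 Gbps link. None of these derived values is reported in the abstract. The function names are illustrative, and the last calculation assumes decimal units (1 Gbps = 10^9 bits per second) and that every byte on the link belongs to some record, which the abstract does not state.

```python
# Figures quoted in the abstract.
PRECISION = 0.9172
RECALL = 0.9451
RECORDS_PER_HOUR = 213_000_000

# Assumption: 1 Gbps is taken as 10**9 bits per second (decimal units).
LINK_BITS_PER_SECOND = 1_000_000_000


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived, not reported in the abstract)."""
    return 2 * precision * recall / (precision + recall)


def records_per_second(records_per_hour: int) -> float:
    """Average number of records processed per second over one hour."""
    return records_per_hour / 3600


def bytes_per_record_at_line_rate(records_per_hour: int, link_bits_per_second: int) -> float:
    """Average bytes of traffic per record if the link were saturated for the full hour.

    Assumption: all traffic on the link is attributed to the processed records.
    """
    bytes_per_hour = link_bits_per_second / 8 * 3600
    return bytes_per_hour / records_per_hour


if __name__ == "__main__":
    print(f"Implied F1 score: {f1_score(PRECISION, RECALL):.4f}")
    print(f"Records per second: {records_per_second(RECORDS_PER_HOUR):,.0f}")
    per_record = bytes_per_record_at_line_rate(RECORDS_PER_HOUR, LINK_BITS_PER_SECOND)
    print(f"Bytes per record at line rate: {per_record:,.0f}")
```

Under these assumptions, the throughput claim corresponds to roughly 59,000 records per second, and the "near 1 Gbps" statement would hold if each record accounted for about 2 KB of traffic on average.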
