IEEE International Conference on Advanced Information Networking and Applications

Applicability of Probabilistic Data Structures for Filtering Tasks in Data Loss Prevention Systems



Abstract

The paper studies the applicability of a probabilistic data structure known as the Bloom filter (BF) in the content analysis component of Data Loss Prevention (DLP) systems. The study shows that BFs may serve as a preliminary selection mechanism in content analysis. The goal of such a mechanism is to quickly pre-select documents that may be similar to the one being checked. This selection should be followed by a more detailed comparison to cope with the false positives produced by BFs. A specialized form of the filter, called Matrix BF, was found particularly helpful for the content analysis task, as it provides search localization and allows the filter to grow along with the document database while maintaining linear search time. The paper outlines a theoretical threshold for false positives when comparing two rows in the Matrix BF. The threshold was confirmed by experiments. The experiments also indicated acceptable computational performance and an acceptable level of false positives. Tests with obfuscated texts revealed some limitations of the proposed approach.
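
The pre-selection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual construction, so everything here is an assumption made for illustration: the class name `MatrixBloomFilter`, the use of word 3-grams as tokens, salted SHA-256 as the hash family, the default sizes, and the 0.5 overlap threshold (which is not the theoretical threshold the paper derives). Each stored document gets its own Bloom filter row, and a query is compared against every row, so search time grows linearly with the number of documents.

```python
import hashlib


class MatrixBloomFilter:
    """Hypothetical sketch: one Bloom filter row per stored document."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.rows = {}  # doc_id -> Python int used as a bit vector

    def _positions(self, token):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def _row(self, text, shingle=3):
        # Build a bit vector from overlapping word n-grams (an assumed tokenization).
        words = text.lower().split()
        bits = 0
        for i in range(max(len(words) - shingle + 1, 1)):
            token = " ".join(words[i:i + shingle])
            for pos in self._positions(token):
                bits |= 1 << pos
        return bits

    def add(self, doc_id, text):
        # Adding a document appends a row; existing rows are untouched.
        self.rows[doc_id] = self._row(text)

    def candidates(self, text, threshold=0.5):
        # Linear scan: compare the query row with each stored row and keep
        # documents whose share of matching query bits reaches the threshold.
        query = self._row(text)
        query_bits = bin(query).count("1")
        hits = []
        for doc_id, row in self.rows.items():
            shared = bin(query & row).count("1")
            if query_bits and shared / query_bits >= threshold:
                hits.append(doc_id)
        return hits


mbf = MatrixBloomFilter()
mbf.add("a", "the quick brown fox jumps over the lazy dog")
mbf.add("b", "completely unrelated content about network routing tables and protocols")
hits = mbf.candidates("the quick brown fox jumps over the lazy dog")
```

Because Bloom filters can report false positives, the documents returned by `candidates` would only be passed on to a more detailed comparison, as the abstract notes; they are not a final verdict.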
