Conference: Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy

Budget-Constrained Result Integrity Verification of Outsourced Data Mining Computations


Abstract

When a client outsources data mining to an untrusted service provider under the Data-Mining-as-a-Service (DMaS) paradigm, it is important to verify that the service provider (the server) returns correct mining results, which take the form of data mining objects. We consider the setting in which each data mining object carries a weight that reflects its importance. For a client with a limited verification budget, the server selects a subset of the mining results whose total verification cost does not exceed the budget and whose total weight is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard, so the server may run a heuristic algorithm to select the subset of mining results to be verified. The server has a financial incentive to cheat on the heuristic output, so that the client pays more to verify mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. Verifying the result integrity of heuristic algorithms is challenging because their results are non-deterministic. We design a probabilistic verification method that inserts negative candidates (NCs), which are guaranteed to be excluded from the budgeted maximization result of ratio-based BMC solutions. We perform extensive experiments on real-world datasets and show that the NC-based verification approach can achieve a high guarantee with small overhead.
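The selection and checking steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how NCs are constructed or how the exclusion guarantee is obtained, so everything here is an assumption for illustration. `Item`, `greedy_select`, `passes_nc_check`, and the toy data are hypothetical names and values. The heuristic is a plain weight-to-cost ratio greedy, and the NC is simply a low-ratio item that this greedy does not pick on the toy input.

```python
from typing import NamedTuple


class Item(NamedTuple):
    name: str
    weight: float  # importance of the mining result
    cost: float    # cost of verifying it


def greedy_select(items, budget):
    """Ratio-based greedy heuristic (an assumed stand-in for the server's
    heuristic): scan items by decreasing weight/cost and keep each one
    that still fits in the remaining budget."""
    chosen = []
    remaining = budget
    for item in sorted(items, key=lambda it: it.weight / it.cost, reverse=True):
        if item.cost <= remaining:
            chosen.append(item.name)
            remaining -= item.cost
    return chosen


def passes_nc_check(selection, negative_candidates):
    """Client-side check: a returned selection is suspicious if it contains
    any negative candidate the client planted."""
    return not (set(selection) & set(negative_candidates))


# Toy data (hypothetical): three real results plus one planted NC with a
# very poor weight/cost ratio.
REAL_ITEMS = [Item("a", 10, 2), Item("b", 6, 3), Item("c", 4, 4)]
NC_ITEMS = [Item("nc1", 1, 5)]
BUDGET = 6

honest = greedy_select(REAL_ITEMS + NC_ITEMS, BUDGET)
cheating = ["nc1"]  # a server padding the output with an unimportant result

print(honest, passes_nc_check(honest, [nc.name for nc in NC_ITEMS]))
print(cheating, passes_nc_check(cheating, [nc.name for nc in NC_ITEMS]))
```

The check is probabilistic in the sense the abstract describes: a cheating server is caught only when its altered selection happens to include a planted NC, so a selection that swaps in unimportant real results but avoids every NC would pass this particular check.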
