Statistical inference and data mining: false discoveries control

Abstract

Data mining is characterised by its ability to process large amounts of data. Among these are the data "features": variables, or association rules that can be derived from them. Selecting the most interesting features is a classical data mining problem. That selection requires a large number of tests, which in turn produce a number of false discoveries. This paper proposes an original nonparametric method for controlling them. A new criterion, UAFWER, defined as the risk of exceeding a pre-set number of false discoveries, is controlled by BS FD, a bootstrap-based algorithm that can be used on one- or two-sided problems. The usefulness of the procedure is illustrated by the selection of differentially interesting association rules on genetic data.
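
To make the criterion concrete, the following is a minimal sketch of a generic bootstrap calibration, not the BS FD algorithm itself, whose details the abstract does not give. It assumes a one-sided problem in which each feature is a column of a data matrix and is "interesting" when its mean is above zero. The names `bootstrap_threshold`, `select_features`, `v0` (the pre-set number of tolerated false discoveries) and `delta` (the accepted risk of exceeding it) are hypothetical and chosen only for illustration.

```python
import numpy as np


def bootstrap_threshold(X, v0=0, delta=0.05, n_boot=500, seed=0):
    """Return (stat, eps): per-feature means and a bootstrap threshold.

    eps is the (1 - delta) quantile, over bootstrap resamples, of the
    (v0 + 1)-th largest centred deviation (bootstrap mean - observed mean).
    This is an illustrative assumption, not the paper's exact procedure.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    if not 0 <= v0 < m:
        raise ValueError("v0 must satisfy 0 <= v0 < number of features")
    rng = np.random.default_rng(seed)
    stat = X.mean(axis=0)  # observed statistic for every feature
    kth_largest = np.empty(n_boot)
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)  # resample rows with replacement
        deviation = X[rows].mean(axis=0) - stat  # centred bootstrap statistic
        kth_largest[b] = np.sort(deviation)[-(v0 + 1)]
    eps = float(np.quantile(kth_largest, 1.0 - delta))
    return stat, eps


def select_features(X, v0=0, delta=0.05, n_boot=500, seed=0):
    """Keep the features whose observed mean exceeds the bootstrap threshold."""
    stat, eps = bootstrap_threshold(X, v0, delta, n_boot, seed)
    return np.flatnonzero(stat > eps), eps


if __name__ == "__main__":
    # Synthetic data: 50 features, of which only the first 5 have a nonzero mean.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 50))
    X[:, :5] += 1.0
    selected, eps = select_features(X, v0=2)
    print("threshold:", eps)
    print("selected features:", selected)
```

The idea behind the sketch is that more than `v0` uninteresting features can pass the threshold only if the `(v0 + 1)`-th largest chance deviation exceeds it, and the threshold is set so that the bootstrap estimate of that probability is about `delta`. Resampling whole rows keeps the dependence between features intact, which matters when the features are association rules derived from the same variables.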