Conference: SIAM International Conference on Data Mining

Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling



Abstract

This paper extends the Empirical Risk Minimization (ERM) paradigm, from a practical perspective, to the situation where a natural estimate of the risk takes the form of a K-sample U-statistic, as is the case in the K-partite ranking problem, for instance. In that setting, the numerical computation of the empirical risk is hardly feasible, if not infeasible, even for moderate sample sizes: it involves averaging $O(n^{d_1+\ldots+d_K})$ terms when considering a U-statistic of degrees $(d_1, \ldots, d_K)$ based on samples of sizes proportional to $n$. We propose to consider a drastically simpler Monte-Carlo version of the empirical risk based on only $O(n)$ terms, which can be viewed as an incomplete generalized U-statistic, and prove that, remarkably, the approximation stage does not damage the ERM procedure and yields a learning rate of order $O_{\mathbb{P}}(1/\sqrt{n})$. Beyond a theoretical analysis guaranteeing the validity of this approach, numerical experiments are displayed for illustrative purposes.
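To make the idea concrete, here is a minimal sketch, not taken from the paper, of a complete versus an incomplete two-sample U-statistic. It assumes the simplest case of degrees (1, 1) with an AUC-style kernel, and it assumes that the incomplete version draws its pairs uniformly with replacement. The function names, the Gaussian toy data, and the budget of 2n terms are all hypothetical choices made for illustration.

```python
import random


def auc_kernel(x_pos, x_neg):
    """Kernel of a two-sample U-statistic of degrees (1, 1): AUC-style comparison."""
    if x_pos > x_neg:
        return 1.0
    if x_pos == x_neg:
        return 0.5
    return 0.0


def complete_u_statistic(pos, neg, kernel=auc_kernel):
    """Average the kernel over all len(pos) * len(neg) pairs."""
    total = sum(kernel(p, q) for p in pos for q in neg)
    return total / (len(pos) * len(neg))


def incomplete_u_statistic(pos, neg, n_terms, rng, kernel=auc_kernel):
    """Average the kernel over n_terms pairs drawn uniformly with replacement.

    Sampling with replacement is an assumption of this sketch.
    """
    total = 0.0
    for _ in range(n_terms):
        total += kernel(rng.choice(pos), rng.choice(neg))
    return total / n_terms


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 500
pos = [rng.gauss(1.0, 1.0) for _ in range(n)]  # toy "positive" scores
neg = [rng.gauss(0.0, 1.0) for _ in range(n)]  # toy "negative" scores

full = complete_u_statistic(pos, neg)  # n * n = 250,000 kernel evaluations
approx = incomplete_u_statistic(pos, neg, n_terms=2 * n, rng=rng)  # 1,000 evaluations
print(f"complete: {full:.3f}  incomplete: {approx:.3f}")
```

In this toy setting the incomplete estimate uses a number of kernel evaluations that grows linearly in n, while the complete one grows quadratically. That is the cost gap the abstract describes for general degrees.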
