Active Local Learning

Abstract

In this work we consider active {\em local learning}: given a query point $x$, and active access to an unlabeled training set $S$, output the prediction $h(x)$ of a near-optimal $h \in H$ using significantly fewer labels than would be needed to actually learn $h$ fully. In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$. This immediately also implies an algorithm for {\em distance estimation}: estimating the value $opt(H)$ from many fewer labels than needed to actually learn a near-optimal $h \in H$, by running local learning on a few random query points and computing the average error. For the hypothesis class consisting of functions supported on the interval $[0,1]$ with Lipschitz constant bounded by $L$, we present an algorithm that makes $O((1/\epsilon^6)\log(1/\epsilon))$ label queries from an unlabeled pool of $O((L/\epsilon^4)\log(1/\epsilon))$ samples. It estimates the distance to the best hypothesis in the class to an additive error of $\epsilon$ for an arbitrary underlying distribution. We further generalize our algorithm to more than one dimension. We emphasize that the number of labels used is independent of the complexity of the hypothesis class, which is linear in $L$ in the one-dimensional case. Furthermore, we give an algorithm to locally estimate the values of a near-optimal function at a few query points of interest with a number of labels independent of $L$. We also consider the related problem of approximating the minimum error that can be achieved by the Nadaraya-Watson estimator under a linear diagonal transformation with eigenvalues coming from a small range. For a $d$-dimensional point set of size $N$, our algorithm achieves an additive approximation of $\epsilon$, makes $\tilde{O}(d/\epsilon^2)$ queries, and runs in $\tilde{O}(d^2/\epsilon^{d+4} + dN/\epsilon^2)$ time.
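The distance-estimation reduction described in the abstract is simple enough to sketch in code. The following is a minimal illustration, not the paper's algorithm: `local_predict` is a hypothetical oracle standing in for the local learner, assumed to answer every call consistently with one fixed near-optimal $h \in H$ (the well-definedness guarantee above), and `query_label` is a hypothetical oracle that spends one label per call; the absolute loss is a placeholder for whichever loss defines $opt(H)$.

    import random

    def estimate_opt(pool, query_label, local_predict, m=200):
        """Estimate opt(H) = err(h) without learning h globally.

        pool          : unlabeled sample S (list of points)
        query_label   : oracle returning the true label of a point (one label per call)
        local_predict : local learner returning h(x) for the same fixed h on every call
        m             : number of random evaluation points; the label cost is O(m)
                        plus the local learner's per-query cost, independent of
                        the complexity of H
        """
        total = 0.0
        for _ in range(m):
            x = random.choice(pool)                          # random query point
            total += abs(local_predict(x) - query_label(x))  # pointwise loss of h at x
        return total / m                                     # empirical estimate of err(h)

By a standard concentration argument, averaging over $m = O(1/\epsilon^2)$ random points estimates $err(h)$ to additive error $\epsilon$; the content of the paper is making each `local_predict` call cheap while keeping $h$ well-defined.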
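For reference, the Nadaraya-Watson estimator named in the last result has the standard textbook form (the kernel $K$ and bandwidth $b$ below are the usual generic choices, not parameters taken from this abstract):

\[
\hat{f}(x) \;=\; \frac{\sum_{i=1}^{N} K\big((x - x_i)/b\big)\, y_i}{\sum_{i=1}^{N} K\big((x - x_i)/b\big)},
\]

and the problem considered here is to approximate the minimum error this estimator can achieve after a linear diagonal transformation $x \mapsto Dx$ of the data, where the eigenvalues of $D$ come from a small range.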