
Large Margin vs. Large Volume in Transductive Learning



Abstract

We focus on distribution-free transductive learning. In this setting the learning algorithm is given a 'full sample' of unlabeled points. A training sample is then selected uniformly at random from the full sample and the labels of the training points are revealed. The goal is to predict the labels of the remaining unlabeled points as accurately as possible. The full sample partitions the transductive hypothesis space into a finite number of equivalence classes; all hypotheses in the same equivalence class generate the same dichotomy of the full sample. We consider a large volume principle, whereby the priority of each equivalence class is proportional to its "volume" in the hypothesis space. The large volume principle was previously treated for the case of hyperplanes. In this paper, instead of hyperplanes, we consider soft classification vectors whose set of equivalence classes w.r.t. the full sample contains all possible dichotomies. Symmetry is broken by generating equivalence classes of non-uniform volume, defined via a non-axis-aligned, data-dependent ellipsoid. Since exact volume estimation, or approximation with quantifiable guarantees, is computationally hard, we resort to a cruder approach in which volume is related to the angles between hypotheses and the principal axes of the ellipsoid. This approach makes sense because long principal axes lie in regions of large volume. Our construction leads to a family of transductive algorithms, and here we focus on one instantiation. Although the resulting algorithm is defined in terms of a non-convex optimization problem, we develop an efficient globally optimal solution using a known technique. We also derive a data-dependent error bound for this algorithm. Our experiments with the new Approximate Volume Regularization (AVR) algorithm over 31 datasets show its overwhelming advantage over TSVM and SVM in text categorization and image classification. However, on a different set of UCI datasets, TSVM and SVM are significantly superior to AVR. We identify some factors that influence the success and failure of our algorithm. One interesting observation is that AVR has a significant advantage over TSVM when TSVM outperforms SVM, and vice versa.
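To make the angle-based idea concrete, the following is a minimal sketch, not the paper's actual AVR formulation. It assumes, purely for illustration, that the data-dependent ellipsoid is the one induced by the Gram matrix of the full sample, and that a soft classification vector is scored by a weighted sum of its squared cosines with the ellipsoid's principal axes, with smaller weight on long axes (large-volume directions).

import numpy as np

def approximate_volume_penalty(h, X):
    """Illustrative angle-based 'volume' penalty for a soft classification vector.

    X : (m, d) array, the full sample (training + test points).
    h : (m,) soft classification vector; sign(h) is the predicted dichotomy.

    Assumption (not from the paper): the data-dependent ellipsoid is taken to be
    the one induced by the Gram matrix K = X X^T, so its principal axes are K's
    eigenvectors and its axis lengths are K's eigenvalues. Hypotheses aligned
    with long axes (large-volume regions) receive a smaller penalty.
    """
    K = X @ X.T                                  # illustrative choice of ellipsoid matrix
    axis_lengths, axes = np.linalg.eigh(K)       # principal axes of the ellipsoid
    axis_lengths = np.clip(axis_lengths, 1e-12, None)
    h_unit = h / np.linalg.norm(h)
    cos_sq = (axes.T @ h_unit) ** 2              # squared cosines with each principal axis
    return float(np.sum(cos_sq / axis_lengths))  # small when h follows long axes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))                 # toy full sample of 50 points
    h_aligned = X @ rng.normal(size=5)           # hypothesis aligned with the data
    h_random = rng.normal(size=50)               # hypothesis ignoring the data
    print(approximate_volume_penalty(h_aligned, X))   # much lower penalty
    print(approximate_volume_penalty(h_random, X))

In the paper itself this kind of penalty enters a non-convex transductive objective that is nonetheless solved to global optimality; the sketch above only illustrates why directions along long principal axes are favored.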
