Learning Probabilistic Linear-Threshold Classifiers via Selective Sampling


Abstract

In this paper we investigate selective sampling, a learning model where the learner observes a sequence of i.i.d. unlabeled instances, each time deciding whether to query the label of the current instance. We assume that labels are binary and stochastically related to instances via a linear probabilistic function whose coefficients are arbitrary and unknown. We then introduce a new selective sampling rule and show that its expected regret (with respect to the classifier knowing the underlying linear function and observing the label realization after each prediction) grows not much faster than the number of sampled labels. Furthermore, under additional assumptions on the true margin distribution, we prove that the number of sampled labels grows only logarithmically in the number of observed instances. Experiments carried out on a text categorization problem show that: (1) our selective sampling algorithm performs better than the Perceptron algorithm even when the latter is given the true label after each classification; (2) when allowed to observe the true label after each classification, the performance of our algorithm remains the same. Finally, we note that by expressing our selective sampling rule in dual variables we can learn nonlinear probabilistic functions via the kernel machinery.
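
The selective sampling protocol described in the abstract can be sketched in a few lines of Python. This is only an illustration of the learning model, not the rule introduced in the paper, which the abstract does not spell out. The query rule used here (query with probability b / (b + |w . x|)), the Perceptron-style update, and the names `draw_label`, `make_stream`, and `selective_sampling` are all assumptions made for the example. Labels are drawn so that E[y | x] = u . x, which matches the "linear probabilistic function" assumption when |u . x| <= 1.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def draw_label(u, x, rng):
    """Draw y in {-1, +1} with E[y | x] = u . x (assumes |u . x| <= 1)."""
    p_plus = (1.0 + dot(u, x)) / 2.0
    return 1 if rng.random() < p_plus else -1


def make_stream(u, n, seed=1):
    """Yield n pairs (x, oracle); the label is revealed only if oracle() is called."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in u]
        norm = max(1e-12, sum(a * a for a in x) ** 0.5)
        x = [a / norm for a in x]  # unit-norm instance
        y = draw_label(u, x, rng)
        yield x, (lambda y=y: y)


def selective_sampling(stream, b=1.0, seed=0):
    """Predict on every instance, but query labels only selectively.

    Hypothetical rule for illustration: query with probability
    b / (b + |w . x|), so small-margin (uncertain) instances are
    queried more often; update Perceptron-style on queried mistakes.
    """
    rng = random.Random(seed)
    w = None
    predictions, queries = [], 0
    for x, oracle in stream:
        if w is None:
            w = [0.0] * len(x)
        margin = dot(w, x)
        y_hat = 1 if margin >= 0 else -1
        predictions.append(y_hat)
        if rng.random() < b / (b + abs(margin)):
            y = oracle()  # the only place a label is observed
            queries += 1
            if y != y_hat:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, predictions, queries


if __name__ == "__main__":
    u = [0.6, -0.8]  # hypothetical target vector with unit norm
    w, preds, queries = selective_sampling(make_stream(u, 1000))
    print("queried", queries, "of", len(preds), "labels")
```

The sketch makes a prediction on every instance and counts queried labels separately, which are the two quantities the abstract's regret and label-complexity statements compare.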


Bibliographic details

Source
16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Aug 24-27, 2003, Washington, DC, USA | 2003 | pp. 373-387 | 15 pages
Conference location: Washington, DC (US)
Authors
Nicolo Cesa-Bianchi; Alex Conconi; Claudio Gentile
Author affiliation

Dept. of Information Technologies, Universita degli Studi di Milano, Crema, Italy
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Learning noisy linear classifiers via adaptive and selective sampling [J]. Giovanni Cavallanti, Nicolo Cesa-Bianchi, Claudio Gentile. Machine Learning, 2011, Issue 1.
2. Probabilistic classifier: generated using randomised sub-sampling of the feature space [J]. Jonathan D Tyzack, Hamse Y Mussa, Robert C Glen. Journal of Cheminformatics, 2012, Issue S1.
3. Committee-Based Sample Selection for Probabilistic Classifiers [J]. Argamon-Engelson S., Dagan I. The Journal of Artificial Intelligence Research, 1999, Issue 7.
4. Learning Probabilistic Linear-Threshold Classifiers via Selective Sampling [C]. Nicolo Cesa-Bianchi, Alex Conconi, Claudio Gentile. Annual Conference on Learning Theory, 2003.
5. Learning Probabilistic Generative Models for Fast Sampling-based Planning [D]. Huh, Jinwook. 2019.
6. Probabilistic classifier: generated using randomised sub-sampling of the feature space [O]. Jonathan D Tyzack, Hamse Y Mussa, Robert C Glen. 2012.
7. Learning Noisy Linear Classifiers via Adaptive and Selective Sampling [O]. Giovanni Cavallanti, Nicolò Cesa-bianchi, Claudio Gentile. 2013.
