
Co-training from an Incremental EM Perspective


Abstract

We study classification when most of the data is unlabeled and only a small fraction is labeled: the so-called semi-supervised learning setting. Blum and Mitchell's co-training [1] is a popular semi-supervised algorithm for cases where we have multiple independent views of the entities to classify. An example of a multi-view situation is classifying web pages: one view may describe the pages by the words that occur on them, while another describes them by the words in the hyperlinks that point to them. In co-training, two learners each form a model from the labeled data and then incrementally label small subsets of the unlabeled data for each other. Each learner then re-estimates its model from the labeled data and the pseudo-labels provided by the other learner. Though some analysis of the algorithm's performance exists [1], the computation it performs is still not well understood. We propose that each view in co-training is effectively performing incremental EM, as postulated by Neal and Hinton [3], combined with a Bayesian classifier. This analysis suggests improvements over the core co-training algorithm. We introduce variations that converge faster than the core co-training algorithm to the maximum possible classification accuracy, and therefore increase learning efficiency. We empirically verify our claim on a number of data sets in the context of belief network learning.
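The core co-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a tiny naive Bayes classifier as a stand-in for the Bayesian classifier (the paper works with belief networks), keeps one shared pool of labels that both learners add to, and does not include the proposed variations or the incremental EM analysis. The names `TinyNaiveBayes`, `fit_views`, `co_train`, `rounds`, and `per_round` are hypothetical.

```python
import math


class TinyNaiveBayes:
    """Illustrative naive Bayes for binary features and labels in {0, 1}."""

    def fit(self, X, y):
        n_features = len(X[0])
        self.prior, self.theta = {}, {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            # Laplace smoothing keeps every probability strictly inside (0, 1).
            self.prior[c] = (len(rows) + 1) / (len(X) + 2)
            self.theta[c] = [
                (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict_proba(self, x):
        """Return [P(label=0 | x), P(label=1 | x)]."""
        log_joint = []
        for c in (0, 1):
            score = math.log(self.prior[c])
            for x_j, t in zip(x, self.theta[c]):
                score += math.log(t if x_j else 1.0 - t)
            log_joint.append(score)
        top = max(log_joint)  # subtract the max before exponentiating
        weights = [math.exp(s - top) for s in log_joint]
        total = sum(weights)
        return [w / total for w in weights]


def fit_views(views, known):
    """Fit one classifier per view on all currently labeled examples."""
    idx = sorted(known)
    y = [known[i] for i in idx]
    return [TinyNaiveBayes().fit([view[i] for i in idx], y) for view in views]


def co_train(view_a, view_b, seed_labels, rounds=10, per_round=2):
    """Assumed simplification: both learners add pseudo-labels to one shared pool."""
    views = (view_a, view_b)
    known = dict(seed_labels)  # example index -> true label or pseudo-label
    unlabeled = set(range(len(view_a))) - set(known)
    for _ in range(rounds):
        if not unlabeled:
            break
        models = fit_views(views, known)
        for model, view in zip(models, views):
            # Rank the remaining unlabeled examples by this view's confidence.
            ranked = sorted(
                ((max(model.predict_proba(view[i])), i) for i in unlabeled),
                reverse=True,
            )
            for _conf, i in ranked[:per_round]:
                p0, p1 = model.predict_proba(view[i])
                known[i] = 0 if p0 >= p1 else 1  # pseudo-label for the other learner
                unlabeled.discard(i)
    # Re-estimate both models once more so they reflect the final pseudo-labels.
    return fit_views(views, known), known
```

Here `view_a` and `view_b` are two lists of binary feature vectors describing the same examples, and `seed_labels` maps the indices of the few labeled examples to their classes. Labeling only `per_round` examples per view in each pass is what makes the procedure incremental: each model is re-estimated after a small batch of pseudo-labels rather than after labeling everything at once.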
