Optimistic Active Learning using Mutual Information

Abstract

An "active learning system" sequentially decides which unlabeled instance to label, with the goal of efficiently gathering the information necessary to produce a good classifier. Some such systems greedily select the next instance based only on properties of that instance and the few currently labeled points, e.g., selecting the one closest to the current classification boundary. Unfortunately, these approaches ignore the valuable information contained in the other unlabeled instances, which can help identify a good classifier much faster. Previous approaches that do exploit this unlabeled data mostly use it in a conservative way. A common property of the approaches in the literature is that the active learner sticks to a single query selection criterion throughout the whole process. We propose a system, MM+M, that selects the query instance able to provide the maximum conditional mutual information about the labels of the unlabeled instances, given the labeled data, in an optimistic way. This approach implicitly exploits the discriminative partition information contained in the unlabeled data. Rather than relying on a single selection criterion, MM+M also employs a simple on-line method that changes its selection rule when it encounters an "unexpected label". Our empirical results demonstrate that this new approach works effectively.
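
The selection idea in the abstract can be illustrated with a small sketch. Maximizing the conditional mutual information between a query's label and the remaining unlabeled labels is equivalent to minimizing the entropy that remains after the query, and "optimistic" is read here as scoring each candidate by its best-case label. Everything else is an assumption made for illustration and is not taken from the abstract: the nearest-centroid classifier, the approximation of the remaining entropy by a sum of per-instance entropies, the most-uncertain fallback rule used after an unexpected label, and all function names.

```python
import math

def fit_centroids(X, y):
    """Return {label: centroid} for labeled points (lists of floats)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {c: -sum((a - b) ** 2 for a, b in zip(x, mu))
              for c, mu in centroids.items()}
    m = max(scores.values())
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def entropy(p):
    """Shannon entropy (nats) of a {label: probability} distribution."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def total_entropy(X_lab, y_lab, X_unl):
    """Assumed proxy for the remaining uncertainty: sum of per-instance entropies."""
    centroids = fit_centroids(X_lab, y_lab)
    return sum(entropy(predict_proba(centroids, x)) for x in X_unl)

def optimistic_query(X_lab, y_lab, X_unl, labels):
    """Pick the (index, guessed label) pair with the lowest best-case remaining entropy."""
    best = None
    for i, x in enumerate(X_unl):
        rest = X_unl[:i] + X_unl[i + 1:]
        for y in labels:
            score = total_entropy(X_lab + [x], y_lab + [y], rest)
            if best is None or score < best[0]:
                best = (score, i, y)
    return best[1], best[2]

def most_uncertain_query(X_lab, y_lab, X_unl):
    """Assumed fallback rule: the instance with the highest predictive entropy."""
    centroids = fit_centroids(X_lab, y_lab)
    return max(range(len(X_unl)),
               key=lambda i: entropy(predict_proba(centroids, X_unl[i])))

def active_learn(X_lab, y_lab, X_unl, oracle, labels, budget):
    """Query up to `budget` labels, switching rules after an unexpected label."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    use_fallback = False
    for _ in range(min(budget, len(X_unl))):
        if use_fallback:
            i, guess = most_uncertain_query(X_lab, y_lab, X_unl), None
        else:
            i, guess = optimistic_query(X_lab, y_lab, X_unl, labels)
        x = X_unl.pop(i)
        y = oracle(x)
        X_lab.append(x)
        y_lab.append(y)
        # An "unexpected label" is one that contradicts the optimistic guess.
        use_fallback = guess is not None and y != guess
    return X_lab, y_lab
```

Calling `active_learn` with a few seed points, a pool of unlabeled points, and an `oracle` function that returns the true label runs the loop end to end. The exhaustive search retrains the classifier once per candidate and label, so this sketch only suits small pools.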