Journal: IEEE Journal of Biomedical and Health Informatics

Exploring Active Learning Based on Representativeness and Uncertainty for Biomedical Data Classification


Abstract

Biomedical data, such as images and genetic sequences, are now abundant. However, most of this data lacks annotations because of the high cost of labeling. It is therefore necessary to develop techniques that ease the burden of human annotation, and active learning strategies can serve this goal. However, state-of-the-art active learning methods are generally not feasible for dealing with real-world datasets. Another important issue, generally neglected by these methods, is that the classifier tends to learn more and more at each iteration; their selection criteria do not properly exploit the knowledge of the classifier. In this paper, we therefore propose an active learning approach that leverages the learning process, including a novel active learning strategy. The main difference of our strategy is that the classifier participates very actively in its own learning process. This lets us better maximize and prioritize the knowledge the classifier obtains at each iteration and use that knowledge more appropriately when selecting the most informative samples. To do so, our selection criteria give significant weight to the classifications suggested by the classifier. In addition, together with the participation and knowledge of the classifier, we consider both uncertainty and representativeness criteria through a fine-grained analysis of the samples. Experimental results show that our active learning approach outperforms state-of-the-art active learning methods across several supervised classifiers. It therefore handles real-world dataset problems better, balancing the tradeoff between annotation effort and higher accuracy rates.
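
The abstract does not spell out the selection rule, so the following is only a generic sketch of how uncertainty and representativeness criteria are commonly combined in pool-based active learning; it is not the authors' strategy. The function names, the margin-based uncertainty, the distance-based similarity, the product scoring rule, and the `beta` exponent are all assumptions made for illustration. The class probabilities stand in for whatever the current classifier predicts on the unlabeled pool.

```python
import math


def margin_uncertainty(probs):
    """Return 1 - (top1 - top2): near 1 when the two best classes are almost tied."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def representativeness(i, features):
    """Mean similarity of sample i to the rest of the pool.

    Similarity is 1 / (1 + Euclidean distance), an illustrative choice.
    """
    others = [f for j, f in enumerate(features) if j != i]
    if not others:
        return 1.0
    sims = [1.0 / (1.0 + math.dist(features[i], f)) for f in others]
    return sum(sims) / len(sims)


def select_batch(probs, features, k, beta=1.0):
    """Rank unlabeled samples by uncertainty * representativeness**beta, return top-k indices."""
    scores = [
        margin_uncertainty(p) * representativeness(i, features) ** beta
        for i, p in enumerate(probs)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy unlabeled pool: three nearby samples and one outlier (made-up values).
pool_features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
# Hypothetical class probabilities predicted by the current classifier.
pool_probs = [[0.55, 0.45], [0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]

print(select_batch(pool_probs, pool_features, k=2))  # [0, 2]
```

In this toy pool the outlier (index 3) is the most uncertain sample, yet it is not selected because it resembles nothing else in the pool. That is the usual motivation for adding a representativeness term to a purely uncertainty-based criterion.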
