Journal: 《计算机工程与设计》 (Computer Engineering and Design)

Positive and Unlabeled Learning Based on Random Forest

         

Abstract

To train a classifier from positive and unlabeled examples (positive and unlabeled learning, PU learning), an algorithm based on the random forest was proposed. The PU decision tree algorithm POSC4.5 was extended to perform random feature selection while a tree is grown. In the training phase, sampling with replacement on the original PU dataset was used to generate multiple different PU training datasets, and one tree was trained on each of these datasets using the extended POSC4.5. In the classification phase, the outputs of the trained trees were aggregated by majority vote. Experimental results on UCI data sets show that the classification performance of the proposed method is better than that of the biased support vector machine, POSC4.5, and bagging POSC4.5.
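The training and classification phases described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the base learner here is a plain threshold stump that treats unlabeled examples as negatives, used only as a stand-in because the abstract does not specify the internals of POSC4.5. All function names (`bootstrap`, `train_stump`, `train_forest`, `predict`) and the parameters `n_trees` and `k` are hypothetical.

```python
import random
from collections import Counter


def bootstrap(dataset, rng):
    """Draw len(dataset) examples with replacement (one PU training set)."""
    return [rng.choice(dataset) for _ in dataset]


def train_stump(sample, n_features, k, rng):
    """Stand-in base learner (NOT POSC4.5, which is not reproduced here).

    Picks k random features, then the threshold and direction that best
    separate labeled positives (1) from unlabeled examples (0), with the
    simplifying assumption that unlabeled examples count as negatives.
    """
    feats = rng.sample(range(n_features), k)  # random feature selection
    best = None  # (correct_count, feature, threshold, sign)
    for f in feats:
        for x, _ in sample:
            t = x[f]
            for sign in (1, -1):
                # Rule: predict positive when sign * (value - t) >= 0.
                correct = sum(
                    int(sign * (xi[f] - t) >= 0) == yi for xi, yi in sample
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, sign)
    _, f, t, sign = best
    return lambda x: int(sign * (x[f] - t) >= 0)


def train_forest(dataset, n_trees=25, k=1, seed=0):
    """Train n_trees learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n_features = len(dataset[0][0])
    return [
        train_stump(bootstrap(dataset, rng), n_features, k, rng)
        for _ in range(n_trees)
    ]


def predict(forest, x):
    """Aggregate the individual outputs by majority vote."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy PU data: label 1 = known positive, label 0 = unlabeled.
    data = [([10.0 + i, 20.0 + 2 * i], 1) for i in range(10)]
    data += [([0.1 * i, 0.2 * i], 0) for i in range(10)]
    forest = train_forest(data, n_trees=25, k=1, seed=0)
    print(predict(forest, [100.0, 200.0]), predict(forest, [-5.0, -10.0]))
```

An odd `n_trees` avoids tied votes in the binary case. Replacing `train_stump` with a PU-aware tree learner would bring the sketch closer to the method the abstract describes.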
