Interactive Patent Classification Based on Text Mining

Journal: 《高技术通讯》

Abstract

This paper applies text mining to patent analysis and proposes an interactive patent classification algorithm based on multi-classifier fusion and active learning, with the aim of achieving high classification performance. Using the training set, the algorithm first trains a support vector machine sub-classifier for each patent class. Multi-classifier fusion then combines the sub-classifiers into enhanced classifiers, on which the classification decision is based. To refine the classification model, active learning selects the most informative patents for labeling, so that the model is updated through human-computer interaction. Finally, a dynamic batch sampling mode is proposed to address the shortcomings of traditional batch sampling: with dynamic certainty propagation, the selected patents are less redundant and therefore more informative, which further improves the efficiency of active learning. Experimental results show that the proposed algorithm achieves significantly higher classification performance than other algorithms.
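The abstract does not give the certainty propagation rule, so the following is only a minimal sketch of the general idea behind dynamic batch sampling, not the authors' method. It assumes each unlabeled patent has a certainty score in [0, 1] and that a pairwise similarity matrix is available. The function name `select_batch`, the `damping` parameter, and the update rule are all hypothetical choices made for illustration.

```python
def select_batch(certainty, similarity, batch_size, damping=1.0):
    """Pick `batch_size` sample indices for labeling, least certain first.

    Assumed update rule (not taken from the paper): after a sample is
    picked, each remaining sample's certainty moves toward 1.0 in
    proportion to its similarity to the picked sample, so near-duplicates
    of an already selected patent become unlikely to be selected again.
    With damping=0.0 this reduces to static batch sampling.
    """
    cert = list(certainty)  # working copy; the caller's list is untouched
    chosen = []
    for _ in range(min(batch_size, len(cert))):
        remaining = [i for i in range(len(cert)) if i not in chosen]
        # Select the sample the current model is least certain about.
        idx = min(remaining, key=lambda i: cert[i])
        chosen.append(idx)
        # Propagate certainty from the picked sample to the others.
        for j in remaining:
            if j != idx:
                cert[j] += damping * similarity[idx][j] * (1.0 - cert[j])
    return chosen


# Toy data: samples 0 and 1 are near-duplicates and both uncertain.
certainty = [0.10, 0.12, 0.50, 0.90]
similarity = [
    [1.00, 0.95, 0.10, 0.05],
    [0.95, 1.00, 0.10, 0.05],
    [0.10, 0.10, 1.00, 0.20],
    [0.05, 0.05, 0.20, 1.00],
]

dynamic_batch = select_batch(certainty, similarity, batch_size=2)
static_batch = select_batch(certainty, similarity, batch_size=2, damping=0.0)
```

On this toy data the static variant selects both near-duplicates (samples 0 and 1), while the dynamic variant replaces the second one with sample 2, which illustrates how propagation can reduce redundancy within a batch.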
