Journal: 计算机科学与探索

Dynamic Question Classification Ensemble Learning Algorithm Based on Class Label Clustering

     

Abstract

Question classification is a key technology in community question answering systems: it analyzes a natural-language question posed by a user and returns a precise, appropriate category. In online communities, question categories are numerous (>1,000), hierarchically organized, and subject to change over time. To address this, the paper proposes question classification algorithms for two different drift granularities and applies hierarchical ensemble learning over classifiers built from data at different points in time, improving both classification accuracy and efficiency. In addition, to address the feature-set confusion caused by having too many class labels in a single base classifier, the paper clusters the existing hierarchical label tree based on the base classifiers' error rates and the confusion matrix, which further improves classification accuracy and efficiency.
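The abstract does not give the clustering procedure itself, so the following is only a minimal sketch of the general idea of grouping class labels that a base classifier tends to confuse. The function name `cluster_labels`, the mutual-confusion rate, the `threshold` parameter, and the toy labels and counts are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def cluster_labels(confusion, labels, threshold=0.2):
    """Group labels whose mutual confusion rate is at least `threshold`.

    confusion[i][j] is the number of samples of true class i that a base
    classifier predicted as class j. This is an illustrative heuristic,
    not the clustering rule used in the paper.
    """
    n = len(labels)
    parent = list(range(n))  # union-find forest over label indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    row_totals = [sum(row) for row in confusion]
    for i in range(n):
        for j in range(i + 1, n):
            total = row_totals[i] + row_totals[j]
            if total == 0:
                continue  # no samples for either class
            # Fraction of samples from classes i and j mistaken for each other.
            rate = (confusion[i][j] + confusion[j][i]) / total
            if rate >= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[find(i)].append(label)
    return sorted(groups.values())


# Toy data (hypothetical): two pairs of labels that are often confused.
labels = ["python", "java", "cooking", "baking"]
confusion = [
    [70, 25, 3, 2],
    [30, 65, 2, 3],
    [1, 2, 60, 37],
    [2, 1, 35, 62],
]
clusters = cluster_labels(confusion, labels, threshold=0.2)
```

In a setup like this, each resulting cluster could be handled by a coarse classifier first and a smaller per-cluster classifier afterwards, so that no single classifier has to separate all labels at once. How the paper combines this with error rates and the time-based ensemble is not specified in the abstract.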
