Source: 《郑州大学学报(理学版)》 (Journal of Zhengzhou University, Natural Science Edition)

Sub-document Ensemble Learning for SMS Spam Filtering

Abstract

A sub-document ensemble learning (SEL) method is proposed for SMS spam filtering. The SEL framework decomposes the online binary classification of a short text into several sub-problems and makes the final category prediction through a linear combination of the sub-problem results. An effective weak classifier is also implemented, based on a string-frequency-index text classification algorithm. Experimental results show that the SEL framework improves the performance of existing text classification algorithms and that, within the SEL framework, the string-frequency-index-based weak classifier achieves the best filtering performance.
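The general shape of the framework described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how sub-documents are formed, how the combination weights are chosen, or how the string-frequency index works. The sketch assumes overlapping fixed-size character windows as sub-documents, equal combination weights, and a simple character-bigram frequency scorer as a stand-in for the weak classifier. All names (`split_sub_documents`, `NgramFrequencyClassifier`, `SubDocumentEnsemble`) and parameter values are hypothetical.

```python
import math
from collections import Counter


def split_sub_documents(text, size=8, step=4):
    """Cut a message into overlapping character windows (an assumed scheme)."""
    if len(text) <= size:
        return [text]
    return [text[i:i + size] for i in range(0, len(text), step)]


class NgramFrequencyClassifier:
    """Stand-in weak classifier: mean smoothed log-odds of character n-grams.

    This is NOT the paper's string-frequency-index algorithm, only a simple
    frequency-based scorer used to make the sketch runnable.
    """

    def __init__(self, n=2):
        self.n = n
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def _ngrams(self, text):
        return [text[i:i + self.n] for i in range(len(text) - self.n + 1)]

    def score(self, text):
        """Return a value > 0 for spam-like text, < 0 for ham-like text."""
        grams = self._ngrams(text)
        if not grams:
            return 0.0
        total = 0.0
        for g in grams:
            p_spam = (self.counts["spam"][g] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][g] + 1) / (self.totals["ham"] + 2)
            total += math.log(p_spam / p_ham)
        return total / len(grams)

    def update(self, text, label):
        grams = self._ngrams(text)
        self.counts[label].update(grams)
        self.totals[label] += len(grams)


class SubDocumentEnsemble:
    """Scores each sub-document, then combines the scores linearly."""

    def __init__(self, size=8, step=4):
        self.size = size
        self.step = step
        self.weak = NgramFrequencyClassifier()

    def score(self, text):
        subs = split_sub_documents(text, self.size, self.step)
        weights = [1.0 / len(subs)] * len(subs)  # equal weights (assumption)
        return sum(w * self.weak.score(s) for w, s in zip(weights, subs))

    def predict(self, text):
        return "spam" if self.score(text) > 0 else "ham"

    def update(self, text, label):
        for sub in split_sub_documents(text, self.size, self.step):
            self.weak.update(sub, label)


# Online use: predict each incoming message first, then learn from its label.
model = SubDocumentEnsemble()
stream = [
    ("win a free prize now, call today", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("free prize waiting, call now", "spam"),
    ("see you at lunch tomorrow", "ham"),
]
for message, label in stream:
    model.predict(message)
    model.update(message, label)
```

Equal weights keep the linear combination as simple as possible; a weighting learned from feedback would fit the same `score` method without changing the rest of the sketch.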
