Venue: AAAI Workshop

A Redundant Covering Algorithm Applied to Text Classification


Abstract

Covering algorithms for learning rule sets tend toward learning concise rule sets based on the training data. This bias may not be appropriate in the domain of text classification due to the large number of informative features these domains typically contain. We present a basic covering algorithm, DAIRY, that learns unordered rule sets, and present two extensions that encourage the rule learner to milk the training data to varying degrees, by recycling covered training data, and by searching for completely redundant but highly accurate rules. We evaluate these modifications on web page and newsgroup recommendation problems and show recycling can improve classification accuracy by over 10%. Redundant rule learning provides smaller increases in most datasets, but may decrease accuracy in some.
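The abstract does not give DAIRY's search heuristic or its exact recycling scheme, so the following is only a minimal sketch of the general idea, under stated assumptions. It is a binary, word-presence covering learner in which covered positive examples are either discarded (the classic covering behaviour) or kept at a reduced weight so that later rules can still draw on them. The names `learn_rule`, `learn_rule_set`, `recycle_factor`, and `min_weight`, the Laplace-corrected precision score, and the toy data are all illustrative choices, not taken from the paper.

```python
from typing import FrozenSet, List, Sequence, Set

Doc = Set[str]
Rule = FrozenSet[str]  # a conjunction: every word must appear in the document


def covers(rule: Rule, doc: Doc) -> bool:
    return rule <= doc


def score(rule: Rule, docs: Sequence[Doc], labels: Sequence[bool],
          weights: Sequence[float]) -> float:
    """Laplace-corrected precision; positives count by their current weight."""
    pos = sum(w for d, y, w in zip(docs, labels, weights) if y and covers(rule, d))
    neg = sum(1.0 for d, y in zip(docs, labels) if not y and covers(rule, d))
    return (pos + 1.0) / (pos + neg + 2.0)


def learn_rule(docs, labels, weights, seen: List[Rule], max_len: int = 3) -> Rule:
    """Greedily add the word that most improves the score (assumed heuristic)."""
    rule: Rule = frozenset()
    best = score(rule, docs, labels, weights)
    vocab = sorted(set().union(*docs))
    for _ in range(max_len):
        candidates = [
            (score(rule | {w}, docs, labels, weights), w)
            for w in vocab
            if w not in rule and (rule | {w}) not in seen
        ]
        if not candidates:
            break
        cand_score, word = max(candidates)
        if cand_score <= best:
            break
        rule, best = rule | {word}, cand_score
    return rule


def learn_rule_set(docs, labels, recycle_factor: float = 0.0,
                   max_rules: int = 10, min_weight: float = 0.25) -> List[Rule]:
    """recycle_factor=0.0 removes covered positives (plain covering);
    a value in (0, 1) keeps them at reduced weight (recycling)."""
    weights = [1.0 if y else 0.0 for y in labels]
    rules: List[Rule] = []
    while len(rules) < max_rules and any(w >= min_weight for w in weights):
        rule = learn_rule(docs, labels, weights, rules)
        if not rule:
            break  # no word improves on the empty rule
        rules.append(rule)
        weights = [w * recycle_factor if y and covers(rule, d) else w
                   for d, y, w in zip(docs, labels, weights)]
    return rules


def predict(rules: Sequence[Rule], doc: Doc) -> bool:
    # Unordered rule set, simplified to one class: positive if any rule fires.
    return any(covers(r, doc) for r in rules)


# Toy data, invented for illustration only.
DOCS = [
    {"python", "code", "tutorial"},
    {"python", "snake", "zoo"},
    {"code", "compiler", "tutorial"},
    {"zoo", "tickets"},
    {"compiler", "python", "code"},
]
LABELS = [True, False, True, False, True]

baseline = learn_rule_set(DOCS, LABELS, recycle_factor=0.0)
recycled = learn_rule_set(DOCS, LABELS, recycle_factor=0.5)
```

On this toy data the plain covering run stops as soon as one rule covers every positive example, while the recycling run goes on to learn further rules from the same examples. The paper's reported accuracy gains concern its own datasets and algorithm and are not reproduced by this sketch.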
