
2-PS Based Associative Text Classification



Abstract

Recent studies show that associative classification can achieve higher accuracy than traditional approaches. Its main drawback is that it generates a huge number of rules, which makes it difficult to select a subset of rules for accurate classification. In this study, we propose a novel association-based approach that is especially suitable for text classification. The approach first builds a classifier through a 2-PS (Two-Phase) method. The first phase prunes rules locally: rules mined within each category are pruned by a sentence-level constraint, which makes the rules more semantically correlated and less redundant. The second phase compares and selects the remaining rules from a global view: training examples from different categories are merged to evaluate these rules. Moreover, when a new document is labeled, the multiple sentence-level appearances of a rule are taken into account. Experimental results on well-known text corpora show that our method can achieve higher accuracy than many well-known methods. In addition, the performance study shows that our method is quite efficient in comparison with other classification methods.
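The abstract gives no thresholds, rule format, or scoring function, so the following is only a minimal sketch of how a two-phase, sentence-aware rule classifier of this kind might be organized. Everything concrete in it is an assumption made for illustration: the function names (`sentence_itemsets`, `sentence_hits`, `train`, `classify`), the limit of two words per rule, the use of document-level support for the local phase, the use of confidence over the merged training set for the global phase, and the scoring of a new document as confidence multiplied by the number of matching sentences. It should not be read as the authors' actual 2-PS algorithm.

```python
from collections import defaultdict
from itertools import combinations

# Assumed representation: a document is a list of sentences,
# and a sentence is a set of lowercase words.

def sentence_itemsets(doc, max_len=2):
    """Word sets (size <= max_len) whose words co-occur inside one sentence."""
    found = set()
    for sentence in doc:
        words = sorted(sentence)
        for k in range(1, max_len + 1):
            found.update(frozenset(c) for c in combinations(words, k))
    return found

def sentence_hits(itemset, doc):
    """Number of sentences in doc that contain every word of itemset."""
    return sum(1 for sentence in doc if itemset <= sentence)

def train(docs_by_category, min_support=0.6, min_confidence=0.6):
    # Phase 1 (local, assumed form): within each category, keep itemsets
    # that satisfy the sentence-level constraint in at least min_support
    # of that category's documents.
    candidates = set()
    for category, docs in docs_by_category.items():
        counts = defaultdict(int)
        for doc in docs:
            for itemset in sentence_itemsets(doc):
                counts[itemset] += 1
        for itemset, n in counts.items():
            if n / len(docs) >= min_support:
                candidates.add((itemset, category))

    # Phase 2 (global, assumed form): merge the training documents of all
    # categories and keep rules whose confidence on the merged set is high.
    rules = {}
    for itemset, category in candidates:
        covered = {}
        for cat, docs in docs_by_category.items():
            covered[cat] = sum(1 for d in docs if sentence_hits(itemset, d) > 0)
        confidence = covered[category] / sum(covered.values())
        if confidence >= min_confidence:
            rules[(itemset, category)] = confidence
    return rules

def classify(doc, rules):
    """Score each category; a rule counts once per sentence it matches."""
    scores = defaultdict(float)
    for (itemset, category), confidence in rules.items():
        hits = sentence_hits(itemset, doc)
        if hits > 0:
            scores[category] += confidence * hits
    return max(scores, key=scores.get) if scores else None

# Tiny made-up training set, for illustration only.
training = {
    "sports": [
        [{"team", "won", "match"}, {"coach", "praised", "team"}],
        [{"team", "lost", "match"}],
    ],
    "finance": [
        [{"bank", "raised", "rates"}],
        [{"bank", "cut", "rates"}, {"market", "fell"}],
    ],
}
rules = train(training)
print(classify([{"team", "played", "match"}, {"team", "won"}], rules))
```

In this sketch the sentence-level constraint means that a word pair such as "won" and "coach", which appear in the same document but in different sentences, never becomes a rule. Counting matches per sentence at labeling time lets a rule that fires in several sentences of a document contribute more than one that fires once.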
