Journal: International Journal of Approximate Reasoning

A rough set-based case-based reasoner for text categorization

Abstract

This paper presents a novel rough set-based case-based reasoner for text categorization (TC). The reasoner has four main components: a feature term extractor, a document representor, a case selector, and a case retriever. It first reduces the number of feature terms in the documents using the rough set technique. It then reduces the number of documents using a new document selection approach based on the case-based reasoning (CBR) concepts of coverage and reachability. As a result, both the number of feature terms and the number of documents are reduced with only minimal loss of information. Finally, this smaller set of documents with fewer feature terms is used in TC. The proposed reasoner was tested on the Reuters-21578 text data sets. The experimental results demonstrate its effectiveness and efficiency: it significantly reduced the number of feature terms and documents, which is important for improving the efficiency of TC, while preserving and even improving classification accuracy.
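
The abstract does not spell out how the case selector works, so the following is only an illustrative sketch of the general CBR notions of coverage and reachability, not the paper's algorithm. It assumes that each document is a set of feature terms with a category label, that one case "covers" another when both share a label and their Jaccard similarity meets a threshold, and that cases are kept by a greedy cover heuristic. The names `jaccard`, `coverage_sets`, `reachability_sets`, `select_cases`, the threshold value, and the toy documents are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Case = Tuple[FrozenSet[str], str]  # (feature terms, category label)


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity between two term sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def coverage_sets(cases: List[Case], threshold: float) -> Dict[int, Set[int]]:
    """coverage[i] = indices of cases that case i could classify correctly."""
    return {
        i: {
            j
            for j, (terms_j, label_j) in enumerate(cases)
            if label_i == label_j and jaccard(terms_i, terms_j) >= threshold
        }
        for i, (terms_i, label_i) in enumerate(cases)
    }


def reachability_sets(coverage: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """reachability[j] = indices of cases that could classify case j."""
    reach: Dict[int, Set[int]] = {i: set() for i in coverage}
    for i, covered in coverage.items():
        for j in covered:
            reach[j].add(i)
    return reach


def select_cases(cases: List[Case], threshold: float = 0.5) -> List[int]:
    """Greedily keep cases until every case is covered by a kept one."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    cov = coverage_sets(cases, threshold)
    uncovered = set(range(len(cases)))
    selected: List[int] = []
    while uncovered:
        # Every case covers itself, so each round removes at least one case.
        best = max(sorted(cov), key=lambda i: len(cov[i] & uncovered))
        selected.append(best)
        uncovered -= cov[best]
    return selected


# Toy documents (made up for illustration, not taken from Reuters-21578).
docs: List[Case] = [
    (frozenset({"oil", "price", "barrel"}), "crude"),
    (frozenset({"oil", "price", "opec"}), "crude"),
    (frozenset({"wheat", "harvest", "export"}), "grain"),
    (frozenset({"wheat", "harvest", "tonnes"}), "grain"),
    (frozenset({"oil", "price", "barrel", "opec"}), "crude"),
]
kept = select_cases(docs, threshold=0.5)
print(kept)
```

In this sketch a case with a large coverage set can stand in for many similar documents, which is the intuition behind shrinking the case base while losing little information. A real selector would likely also weigh reachability when deciding which cases are safe to drop.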
