International Conference on Swarm, Evolutionary, and Memetic Computing

Text Classification Using Ensemble Features Selection and Data Mining Techniques

Abstract

Text categorization is a text mining/analytics task that involves extracting useful information from unstructured resources and then categorizing the documents. In this paper, we classify the TechTC dataset, which was collected from various Web directories. We employed feature selection methods such as the Gini index, chi-square, t-statistic, and correlation, which drastically reduced the model-building time. Various neural network models, such as the probabilistic neural network, group method of data handling, and multilayer perceptron, yielded higher accuracies than other techniques applied in the literature.
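
As a rough illustration of one of the feature selection methods named in the abstract, the sketch below scores each term with the chi-square statistic of its 2x2 term/class contingency table and keeps the highest-scoring terms. It is not the paper's implementation, which the abstract does not describe. The function names, the token-list input format, the two-class labels, and the toy documents are all assumptions made for the example.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Chi-square score of each term against a binary class label.

    docs   : list of token lists (hypothetical input format)
    labels : list of 0/1 class labels, one per document
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            (df_pos if y == 1 else df_neg)[term] += 1

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # class-1 documents containing the term
        b = df_neg[term]   # class-0 documents containing the term
        c = n_pos - a      # class-1 documents without the term
        d = n_neg - b      # class-0 documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term present in every document has a zero denominator and
        # carries no class information, so it is scored 0.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy data (made up for illustration, not from TechTC).
docs = [
    ["the", "python", "code", "bug"],
    ["the", "code", "compiler", "bug"],
    ["the", "goal", "match", "team"],
    ["the", "team", "score", "match"],
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
selected = select_top_k(scores, 4)
print(selected)
```

Only the selected terms would then be passed to the classifier, which is how a filter of this kind can shorten model-building time: the model sees a much smaller vocabulary.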