Journal: Expert Systems with Applications

Empirical evaluation of feature projection algorithms for multi-view text classification


Abstract

This study proposes (i) a multi-view text classification method and (ii) a ranking method for selecting the best information fusion layer among many variants. Multi-view document classification merits detailed study because it makes it possible to combine different feature sets into yet another view that further improves text classification. To this end, we propose a multi-view framework for text classification composed of two levels of information fusion. At the first level, classifiers are constructed by various machine learning algorithms using different data views, i.e. different vector space models. At the second level, the information fusion layer processes its input using a feature projection method and a meta-classifier modelled by a selected machine learning algorithm, and reaches a final decision based on the classification results produced by the first-level models. Moreover, we propose a ranking method to assess various configurations of the fusion layer. It uses heuristics that utilise statistical properties of the F-score values calculated for the classification results produced at the fusion layer. Both the information fusion layer of the classification framework and the ranking method have been empirically evaluated. For this purpose, we introduce a use case that checks whether companies' domains identify their innovativeness. The results demonstrate empirically that the information fusion layer enhances classification quality. Friedman's aligned rank test, the Wilcoxon signed-rank test and the effect size support this hypothesis. In addition, the Spearman statistical test carried out on the obtained results demonstrated that the assessment made by the proposed ranking method converges to that of a well-established method named Hellinger - The Technique for Order Preference by Similarity to Ideal Solution (H-TOPSIS). Thus, the proposed approach may be used for the assessment of classifier performance.
(C) 2019 National Information Processing Institute. Published by Elsevier Ltd.
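
The abstract does not spell out the ranking heuristic, so the following is only a minimal sketch of the general idea: score each fusion-layer configuration from statistical properties of its F-score values, order the configurations, and compare the resulting order with a reference ranking using Spearman's rank correlation. The "mean minus standard deviation" score, the configuration names, the F-score values and the reference order are all assumptions made for illustration; none of them come from the paper.

```python
from statistics import mean, stdev


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def rank_configurations(f_scores):
    """Order configuration names from best to worst.

    f_scores maps a configuration name to its F-scores over several runs.
    Assumed heuristic (not the paper's): reward a high mean F-score and
    penalise instability by subtracting the sample standard deviation.
    """
    def score(values):
        return mean(values) - stdev(values)

    return sorted(f_scores, key=lambda name: score(f_scores[name]), reverse=True)


def spearman_rho(order_a, order_b):
    """Spearman's rho for two tie-free rankings of the same items.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between an item's positions in the two rankings.
    """
    n = len(order_a)
    position_b = {name: i for i, name in enumerate(order_b)}
    d_squared = sum((i - position_b[name]) ** 2 for i, name in enumerate(order_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical F-scores of three fusion-layer configurations over three runs.
example_f_scores = {
    "pca+logreg": [0.80, 0.82, 0.81],
    "lda+svm": [0.85, 0.60, 0.90],
    "none+nb": [0.70, 0.71, 0.69],
}

# Hypothetical reference order, standing in for a method such as H-TOPSIS.
reference_order = ["pca+logreg", "lda+svm", "none+nb"]

proposed_order = rank_configurations(example_f_scores)
agreement = spearman_rho(proposed_order, reference_order)

if __name__ == "__main__":
    print("proposed order:", proposed_order)
    print("Spearman rho vs reference:", agreement)
```

A rho close to 1 would indicate that the two rankings largely agree, which is the sense in which the abstract says the proposed assessment converges to H-TOPSIS. A real comparison would need many more configurations than this toy example and a significance test on the coefficient.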
