Journal: Expert Systems with Applications

Enhanced sparse representation classifier for text classification

Abstract

Classifying text by its content is an essential step in organizing very large text collections and mining the salient information they contain. The task is receiving growing attention as the volume of online data surges. Classical algorithms such as k-NN (k-nearest neighbor), SVM (Support Vector Machine), and their variants have been observed to yield only reasonable results on this problem, leaving room for further improvement. A class of algorithms commonly referred to as Sparse Methods has recently emerged from compressive sensing and has found numerous effective applications in many areas of data analysis and image processing. Sparse Methods as a tool for text analysis remain an avenue that has not been rigorously explored. This paper explores sparse representation-based methods for text classification. Given the success of sparse representation-based methods in different areas of data analysis, we hypothesized that they should also work well on text classification problems. This paper empirically supports the hypothesis by testing the method on the Reuters and WebKB data sets. The results on these benchmarks show that it can outperform classical classification algorithms such as SVM and k-NN. Obtaining the basis of representation and the sparse codes was observed to be computationally costly, which affects the performance of the system. We also propose a class-wise dictionary refinement algorithm and a dynamic dictionary selection algorithm to make sparse coding faster. Adding dictionary refinement to the classification system not only reduces the time taken for sparse coding but also improves classification accuracy. The outcomes of the study are an empirical verification of the sparse representation classifier as a text classification tool and a computationally efficient solution for the bottleneck operation of sparse coding.
(C) 2019 Elsevier Ltd. All rights reserved.
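
The abstract does not say which sparse coding solver or dictionary construction the authors use, so the following is only a minimal sketch of the general sparse representation classifier idea, not the paper's method. It assumes that the dictionary is simply the set of normalized training vectors, that sparse codes come from plain matching pursuit (a greedy stand-in chosen for brevity), and that a document is assigned to the class whose atoms reconstruct it with the smallest residual. The function names, the four-term toy vocabulary, and the labels are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(atoms, y, n_steps=3):
    """Greedy sparse coding (assumed solver): at most n_steps nonzero coefficients."""
    coefs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_steps):
        scores = [dot(a, residual) for a in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[j]) < 1e-12:
            break  # nothing left that any atom can explain
        coefs[j] += scores[j]
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return coefs

def src_predict(train_vectors, train_labels, y, n_steps=3):
    """Return (predicted label, per-class reconstruction residuals)."""
    atoms = [normalize(v) for v in train_vectors]  # dictionary = training documents
    y = normalize(y)
    coefs = matching_pursuit(atoms, y, n_steps)
    residuals = {}
    for label in set(train_labels):
        # Reconstruct y using only the coefficients that belong to this class.
        recon = [0.0] * len(y)
        for atom, coef, atom_label in zip(atoms, coefs, train_labels):
            if atom_label == label:
                recon = [r + coef * x for r, x in zip(recon, atom)]
        residuals[label] = math.sqrt(sum((p - q) ** 2 for p, q in zip(y, recon)))
    return min(residuals, key=residuals.get), residuals

# Toy term counts over a hypothetical vocabulary: [oil, price, goal, match]
train_vectors = [[3, 2, 0, 0], [2, 3, 0, 1], [0, 0, 3, 2], [0, 1, 2, 3]]
train_labels = ["energy", "energy", "sport", "sport"]

label, residuals = src_predict(train_vectors, train_labels, [2, 2, 0, 0])
print(label, residuals)
```

In this sketch, every test document is coded against the full dictionary, which is the costly step the abstract identifies as the bottleneck. The class-wise dictionary refinement and dynamic dictionary selection it proposes would reduce the set of atoms before coding, but the abstract gives no details, so they are not reproduced here.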