International Conference on Recent Advances in Information Technology

Concise semantic analysis based text categorization using modified hybrid union feature selection approach

Abstract

Text categorization mainly consists of deriving a representation of the corpus in a standard bag-of-words format. The merit of bag-of-words representations is that they consider every term as a feature; the downside is that the computation cost increases with the number of features and with the representation of relations between documents and features. Semantic analysis can help gain an edge by correlating documents and terms in a concept space. However, most semantic analysis techniques have their own limitations when used for text categorization. In this work, a Concise Semantic Analysis (CSA) technique is proposed that extracts concepts from the corpus and then interprets the relationship between documents and words in a given concept space. To improve the performance of CSA, a novel feature selection technique called Modified Hybrid Union (MHU) was designed, which considerably reduced computation time and cost. To validate the proposed approach experimentally, MHU-based CSA was applied to the problem of text categorization. Experiments performed on standard data sets such as Reuters-21578 and WSDL-TC show that the proposed CSA with MHU approach significantly improved performance in terms of execution time and categorization accuracy.
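The abstract does not spell out how MHU works, so the following is only a minimal sketch of the general idea it describes: a bag-of-words representation whose size grows with the vocabulary, followed by a union-style feature selection that keeps fewer terms. The function names, the two scoring criteria (document frequency and total term frequency), and the top-k union rule are all assumptions chosen for illustration; they are not the paper's MHU or CSA.

```python
from collections import Counter


def tokenize(doc):
    # Naive lowercase whitespace tokenizer (an assumption, for illustration only).
    return doc.lower().split()


def bag_of_words(docs):
    # Every distinct term becomes one feature, so vector length equals vocabulary size.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for term, count in Counter(toks).items():
            vec[index[term]] = count
        vectors.append(vec)
    return vocab, vectors


def term_frequency(docs):
    # Total number of occurrences of each term across the corpus.
    return Counter(t for d in docs for t in tokenize(d))


def document_frequency(docs):
    # Number of documents in which each term appears at least once.
    return Counter(t for d in docs for t in set(tokenize(d)))


def union_select(score_tables, k):
    # Hypothetical stand-in for a union-based selector (NOT the paper's MHU):
    # take the top-k terms under each scoring table and keep their union.
    selected = set()
    for scores in score_tables:
        ranked = sorted(scores, key=lambda t: (-scores[t], t))  # ties broken alphabetically
        selected.update(ranked[:k])
    return sorted(selected)


docs = ["oil oil oil prices", "wheat prices", "wheat exports"]
vocab, vectors = bag_of_words(docs)
kept = union_select([document_frequency(docs), term_frequency(docs)], k=1)
print(len(vocab), kept)
```

In this toy corpus the full representation has one feature per distinct term, while the union of the top-ranked term under each criterion keeps a smaller subset. The cost of downstream steps then scales with the reduced feature count, which is the kind of saving the abstract attributes to feature selection.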