International Conference on Computer Engineering and Applications

Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification

Abstract

Feature selection is of paramount concern in the document classification process, since it improves the efficiency and accuracy of a text classifier. The Vector Space Model is used to represent the "bag of words" (BOW) of documents with term weighting. Representing documents through this model has limitations: it ignores term dependencies, document structure, and the ordering of terms within documents. To overcome this problem, a semantics-based feature vector is proposed, which uses an ontology to extract the concept of a term together with its co-occurring and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) weighting with the BOW feature selection method for text classification.
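
To make the BOW limitation concrete, the following is a minimal sketch of the TF-IDF baseline, not the authors' implementation. The weighting variant (raw term count times log(N/df)), the whitespace tokenizer, the toy documents, and the function name `tfidf_vectors` are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for whitespace-tokenized documents.

    Assumed weighting: raw term frequency * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # missing terms count as 0
        vectors.append([tf[term] * math.log(n_docs / df[term]) for term in vocab])
    return vocab, vectors


# Hypothetical toy corpus: the first two documents differ only in word order.
docs = [
    "dog bites man",
    "man bites dog",
    "stock market falls",
]
vocab, vecs = tfidf_vectors(docs)

# Word order is lost: both sentences map to the same vector.
order_is_ignored = vecs[0] == vecs[1]
print(vocab)
print(order_is_ignored)
```

In this sketch the two sentences with opposite meanings receive identical vectors, which illustrates the loss of term ordering and dependencies that the proposed semantics-based feature vector aims to address.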