Published in: International Symposium on Information Technology

Semantic Based Features Selection and Weighting Method for Text Classification

Abstract

Feature selection and weighting are of vital concern in the text classification process, because they improve the efficiency and accuracy of the text classifier. The Vector Space Model represents documents using the "Bag of Words" (BOW) model with term weighting. Document representation through this model has some limitations: it ignores term dependencies and the structure and ordering of the terms in documents. To overcome these problems, a Semantics Based Feature Vector using Part of Speech (POS) is proposed, which extracts the concepts of terms using WordNet, co-occurring terms, and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with BOW feature selection for text classification.
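For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of BOW with TF-IDF weighting. It is not the paper's proposed method. The abstract does not say which TF-IDF variant was used, so the formula here (relative term frequency times log(N/df)) is an assumption, as are the whitespace tokenizer, the function names, and the toy documents. The first two documents also illustrate the limitation the abstract mentions: BOW ignores term order, so the two sentences receive identical vectors.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical minimal tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw strings.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc)
        vectors.append([
            (counts[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ])
    return vocab, vectors


# Toy documents (illustrative only, not from the paper's dataset).
docs = [
    "dog bites man",
    "man bites dog",
    "stock prices rise",
]
vocab, vectors = tfidf_vectors(docs)

# Word order is lost: the first two documents map to the same vector.
print(vectors[0] == vectors[1])
```

A semantic approach such as the one the abstract proposes would add information that this representation discards, for example POS tags and WordNet concepts, but the details of that method are not given in the abstract.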
