Conference: Chinese Control Conference

A Method of Feature Selection Based on Word2Vec in Text Categorization

Abstract

In text categorization, classifier performance degrades as the feature dimension grows. The main purpose of feature selection is to remove irrelevant and redundant features and thereby reduce the feature dimension. Traditional feature selection methods, such as the chi-square statistic (CHI), information gain (IG), and document frequency (DF), consider only how often features occur and ignore feature semantics and part-of-speech information. The word vectors learned by word2vec models have been shown to carry semantic meaning and are useful in various NLP tasks. Based on the word vectors generated by word2vec, the paper proposes the Word2Vec-SM algorithm to reduce the dimensionality of the features. Experiments are reported to validate the Word2Vec-SM algorithm.
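The abstract does not describe how Word2Vec-SM works, so the following is not the paper's algorithm. It is only a minimal sketch of the contrast the abstract draws: a count-based criterion (DF) ranks terms purely by occurrence, while word vectors make it possible to notice that two terms are semantically redundant. The toy corpus, the hand-written three-dimensional vectors (stand-ins for trained word2vec embeddings), the greedy pruning rule, and the 0.95 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it (DF)."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # set(): count each term once per document
    return df


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prune_redundant(terms, vectors, threshold):
    """Greedy pruning (illustrative assumption, not Word2Vec-SM):
    keep a term only if its vector is not too similar to any kept term."""
    kept = []
    for term in terms:
        if all(cosine(vectors[term], vectors[k]) < threshold for k in kept):
            kept.append(term)
    return kept


# Toy corpus: each document is a list of tokens (hypothetical data).
docs = [
    ["car", "engine", "road"],
    ["automobile", "engine", "fuel"],
    ["car", "road", "fuel"],
    ["banana", "fruit"],
]

# Hand-written stand-ins for word2vec embeddings (hypothetical values).
toy_vectors = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.88, 0.12, 0.01],
    "engine": [0.5, 0.5, 0.0],
    "road": [0.3, 0.7, 0.1],
    "fuel": [0.4, 0.2, 0.6],
    "banana": [0.0, 0.1, 0.9],
    "fruit": [0.05, 0.1, 0.95],
}

df = document_frequency(docs)

# Count-based ranking: highest DF first, ties broken alphabetically.
ranked = sorted(df, key=lambda t: (-df[t], t))

# Semantic pruning: drop terms whose vector nearly duplicates a kept term.
selected = prune_redundant(ranked, toy_vectors, threshold=0.95)

print("ranked by DF:", ranked)
print("after pruning:", selected)
```

In this toy setting, DF alone treats "car" and "automobile" as unrelated features, whereas the vector-based step discards "automobile" as a near-duplicate of "car". With real data the vectors would come from a trained word2vec model rather than being written by hand.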
