Conference: European Conference on Artificial Intelligence

Unsupervised Feature Generation using Knowledge Repositories for Effective Text Categorization

Abstract

We propose an unsupervised feature generation algorithm that uses repositories of human knowledge for effective text categorization. The conventional bag-of-words (BOW) representation relies on the presence or absence of keywords to classify documents. To capture the context behind these keywords, we draw on knowledge concepts and hyperlinks from an external knowledge source, obtained through content and structure mining of Wikipedia. The features of these knowledge concepts are then clustered to generate knowledge cluster vectors, which are used to map the input text documents into a high-dimensional feature space in which classification is performed. Simulation results show that the proposed approach identifies associated features in the text collection and improves classification accuracy.
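
The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the concept features or the clustering method, so the sketch assumes a hypothetical toy table of concept links (`CONCEPT_LINKS`, a stand-in for hyperlink structure mined from Wikipedia) and swaps in a simple greedy clustering by Jaccard similarity. All function names are illustrative.

```python
from collections import Counter

# Hypothetical toy data (assumption): each knowledge concept is described by
# the set of concepts it links to, standing in for mined Wikipedia hyperlinks.
CONCEPT_LINKS = {
    "python": {"programming", "software"},
    "java": {"programming", "software"},
    "compiler": {"programming", "software"},
    "guitar": {"music", "instrument"},
    "piano": {"music", "instrument"},
    "violin": {"music", "instrument"},
}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_concepts(links, threshold=0.5):
    """Greedy single-pass clustering (a stand-in, not the paper's method).

    Concepts are visited in sorted order; each one joins the first cluster
    whose seed link set is at least `threshold` similar, else starts a new one.
    """
    clusters = []  # list of (seed_link_set, member_concepts)
    for concept in sorted(links):
        for seed, members in clusters:
            if jaccard(links[concept], seed) >= threshold:
                members.append(concept)
                break
        else:
            clusters.append((links[concept], [concept]))
    return [members for _, members in clusters]


def bow(text):
    """Bag-of-words counts over lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())


def cluster_features(text, clusters):
    """One feature per knowledge cluster: how many member concepts occur."""
    counts = bow(text)
    return [sum(counts[c] for c in members) for members in clusters]


def enrich(text, vocabulary, clusters):
    """Concatenate plain BOW features with the knowledge-cluster features."""
    counts = bow(text)
    return [counts[w] for w in vocabulary] + cluster_features(text, clusters)


clusters = cluster_concepts(CONCEPT_LINKS)
print(clusters)
print(enrich("java compiler error", ["error", "script"], clusters))
print(enrich("python script error", ["error", "script"], clusters))
```

In this toy setting, "java compiler error" and "python script error" share only the keyword "error" under plain BOW, yet both activate the same knowledge cluster, which is the kind of associated feature the abstract refers to. The enriched vectors could then be passed to any standard classifier.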