Conference: IEEE GCC Conference and Exhibition

A new text representation scheme combining Bag-of-Words and Bag-of-Concepts approaches for automatic text classification

Abstract

This paper introduces a new approach to creating text representations and applies it to standard text classification collections. The approach is based on supplementing the well-known Bag-of-Words (BOW) representational scheme with a concept-based representation that utilises Wikipedia as a knowledge base. The proposed representations are used to generate a Vector Space Model, which in turn is fed into a Support Vector Machine classifier to categorise a collection of textual documents from two publicly available datasets. Experimental results are reported that compare the performance of our model with a standard BOW scheme, a concept-based scheme, and recently reported similar text representations that augment the standard BOW approach with concept-based representations.
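
The idea of supplementing word features with concept features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `CONCEPT_MAP` dictionary is a hypothetical stand-in for a Wikipedia-derived knowledge base, the `w:` and `c:` prefixes and all function names are invented here, and raw counts are used where the paper may apply a different weighting. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical phrase-to-concept map standing in for a Wikipedia-derived
# knowledge base. The abstract does not describe the actual mapping.
CONCEPT_MAP = {
    "svm": "Support_vector_machine",
    "support vector machine": "Support_vector_machine",
    "text classification": "Document_classification",
    "document categorisation": "Document_classification",
}


def tokenize(text):
    """Lowercase the text and split on whitespace, dropping commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def bag_of_words(tokens):
    """Count word occurrences; keys carry a 'w:' prefix."""
    return Counter("w:" + t for t in tokens)


def bag_of_concepts(tokens, concept_map, max_len=3):
    """Count concepts whose surface phrases (up to max_len tokens) occur."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in concept_map:
                counts["c:" + concept_map[phrase]] += 1
    return counts


def combined_features(text, concept_map=CONCEPT_MAP):
    """Merge word counts and concept counts into one feature bag."""
    tokens = tokenize(text)
    return bag_of_words(tokens) + bag_of_concepts(tokens, concept_map)


def to_vectors(docs):
    """Build a shared vocabulary and one dense count vector per document."""
    feats = [combined_features(d) for d in docs]
    vocab = sorted(set().union(*feats))
    return vocab, [[f[term] for term in vocab] for f in feats]


if __name__ == "__main__":
    docs = [
        "An SVM performs text classification.",
        "A support vector machine handles document categorisation.",
    ]
    vocab, vectors = to_vectors(docs)
    for term, a, b in zip(vocab, vectors[0], vectors[1]):
        print(term, a, b)
```

In this toy example the two documents share no surface words, so a pure BOW representation gives them no features in common, while the concept features overlap. The resulting vectors are the kind of input that could be passed to a Support Vector Machine classifier.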