Journal of Computational and Theoretical Nanoscience

Minimally Supervised Text Classification Using von Neumann Kernel



Abstract

Kernel methods such as the support vector machine (SVM) have become highly popular for text classification. This is mainly due to their relatively high classification accuracy in several application domains and their ability to handle high-dimensional, sparse data, which are the prohibitive characteristics of textual data representation. A significant challenge in text classification is to reduce the need for labeled training data while maintaining acceptable performance. In this paper, we present a semi-supervised technique that uses the von Neumann kernel for text classification. Specifically, the semantic similarities between terms are first determined from both labeled and unlabeled training data by means of a diffusion process on a graph defined by lexicon and co-occurrence information, and the von Neumann kernel is then constructed based on the learned semantic similarity. Finally, the SVM classifier trains a model for each class during the training phase, and this model is then applied to all test examples in the test phase. The main feature of this approach is that it uses the von Neumann kernel to reveal the semantic similarities between terms in an unsupervised manner, which provides a kernel framework for semi-supervised learning. Experiments on several benchmark data sets demonstrate that the proposed approach is sound and effective.
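
As a rough illustration of the kernel construction described in the abstract, the sketch below computes a von Neumann kernel in its standard closed form, K = K0 (I - lam K0)^{-1} = sum over t >= 0 of lam^t K0^(t+1), from a bag-of-words matrix. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function name `von_neumann_kernel`, the `decay` parameter, the toy matrix `X`, and the use of plain co-occurrence counts only (the abstract also mentions lexicon information, which is not modeled here).

```python
import numpy as np


def von_neumann_kernel(X, decay=0.5):
    """Von Neumann diffusion kernel over documents (illustrative sketch).

    X     : documents x terms count matrix; labeled and unlabeled
            documents are stacked together so both shape the kernel.
    decay : hypothetical knob in (0, 1); lam is set to decay / rho(K0)
            so that the series sum_t lam^t K0^(t+1) converges.
    """
    X = np.asarray(X, dtype=float)
    K0 = X @ X.T  # base linear kernel from shared terms

    # K0 is symmetric PSD, so its spectral radius is its largest eigenvalue.
    rho = np.linalg.eigvalsh(K0)[-1]
    if rho <= 0:
        return K0  # all-zero input: nothing to diffuse

    lam = decay / rho
    n = K0.shape[0]
    # (I - lam K0)^{-1} K0 equals K0 (I - lam K0)^{-1}: the factors commute.
    K = np.linalg.solve(np.eye(n) - lam * K0, K0)
    return (K + K.T) / 2  # remove rounding asymmetry


# Toy corpus: documents 0 and 2 share no term, but both overlap document 1.
X = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])
K0 = X @ X.T
K = von_neumann_kernel(X)
print("base similarity of docs 0 and 2:    ", K0[0, 2])
print("diffused similarity of docs 0 and 2:", K[0, 2])
```

In the toy corpus, documents 0 and 2 have zero similarity under the base kernel but a positive similarity after diffusion, because both share terms with document 1. To classify, the rows and columns of `K` belonging to labeled documents could be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.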
