
Generative classification model for categorical data based on latent Gaussian process



Abstract

Categorical data appear in many machine learning applications, such as computer-aided diagnosis, gene sequence analysis, and natural language processing. For small-scale, high-dimensional data sets, the training samples cover only a relatively small proportion of the possible categorical configurations, so conventional methods based on frequency information, such as the Dirichlet compound multinomial distribution, usually run into over-fitting problems. The latent Gaussian process is an effective Bayesian non-parametric technique for categorical data modeling; it was proposed as an unsupervised method that embeds unlabelled categorical data into a continuous, low-dimensional space through a Gaussian process. As a probabilistic generative model, the latent Gaussian process is capable of density estimation. In this paper, we propose a generative classification model as a supervised method for labelled categorical data, in which the latent Gaussian process is used to estimate the class-conditional densities. Because the complexity of a Gaussian process model adapts to the size of the training data, our method can effectively model small-scale categorical data. Experimental results show that our proposal can achieve better classification performance than other classification models for categorical data. © 2017 Elsevier B.V. All rights reserved.
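The generative classification idea in the abstract (estimate a class-conditional density for each label, then classify with Bayes' rule) can be illustrated with a minimal sketch. This is not the authors' model: the class-conditional density below is a Laplace-smoothed product of independent categorical distributions, assumed here purely as a simple stand-in for the latent Gaussian process density estimator. The class name, the parameters `n_categories` and `alpha`, and the toy data are all hypothetical.

```python
import numpy as np


class GenerativeCategoricalClassifier:
    """Bayes-rule classifier: argmax_c [log p(x | y=c) + log p(y=c)].

    ASSUMPTION: p(x | y=c) is a smoothed product of independent categorical
    distributions, a stand-in for the latent Gaussian process density
    estimator described in the abstract (which is not implemented here).
    """

    def __init__(self, n_categories, alpha=1.0):
        self.n_categories = n_categories  # categories per feature (assumed equal)
        self.alpha = alpha                # smoothing pseudo-count

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        n_classes, n_features = len(self.classes_), X.shape[1]
        self.log_prior_ = np.empty(n_classes)
        # log_cond_[c, d, k] = log p(x_d = k | y = c)
        self.log_cond_ = np.empty((n_classes, n_features, self.n_categories))
        for i, c in enumerate(self.classes_):
            Xc = X[y == c]
            self.log_prior_[i] = np.log(len(Xc) / len(X))
            for d in range(n_features):
                counts = np.bincount(Xc[:, d], minlength=self.n_categories)
                probs = (counts + self.alpha) / (
                    counts.sum() + self.alpha * self.n_categories
                )
                self.log_cond_[i, d] = np.log(probs)
        return self

    def log_joint(self, X):
        """Return log p(x, y=c) for every sample and class, shape (N, C)."""
        X = np.asarray(X)
        feat_idx = np.arange(X.shape[1])
        # Pick log p(x_d | y=c) for each sample/feature, then sum over features.
        log_lik = self.log_cond_[:, feat_idx, X].sum(axis=2)  # shape (C, N)
        return log_lik.T + self.log_prior_

    def predict_proba(self, X):
        lj = self.log_joint(X)
        lj = lj - lj.max(axis=1, keepdims=True)  # stabilise the exponentials
        p = np.exp(lj)
        return p / p.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.log_joint(X), axis=1)]


# Toy usage with two binary features and two classes (made-up data).
X_train = [[0, 0], [0, 0], [0, 1], [1, 1], [1, 1], [1, 0]]
y_train = [0, 0, 0, 1, 1, 1]
clf = GenerativeCategoricalClassifier(n_categories=2).fit(X_train, y_train)
pred = clf.predict([[0, 0], [1, 1]])
proba = clf.predict_proba([[0, 0]])
```

Swapping in a richer density estimator would only change how `log_joint` computes the class-conditional term; the Bayes-rule decision step stays the same.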

Bibliographic details

  • Source
    Pattern Recognition Letters | 2017, Issue 1 | pp. 56-61 | 6 pages
  • Author affiliations

    Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Big Data Res Ctr, Chengdu 611731, Peoples R China;

    Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Big Data Res Ctr, Chengdu 611731, Peoples R China;

    Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 611731, Peoples R China;

    Univ Elect Sci & Technol China, Sch Comp Sci & Technol, Big Data Res Ctr, Chengdu 611731, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Keywords

    Machine learning; Data mining; Categorical data; Generative classification model; Gaussian process;

