Journal: PLoS Computational Biology

Mixture-of-Experts Variational Autoencoder for clustering and generating from similarity-based representations on single cell data

Abstract

Clustering high-dimensional data, such as images or biological measurements, is a long-standing problem and has been studied extensively. Recently, Deep Clustering has gained popularity due to its flexibility in fitting the specific peculiarities of complex data. Here we introduce the Mixture-of-Experts Similarity Variational Autoencoder (MoE-Sim-VAE), a novel generative clustering model. The model can learn multi-modal distributions of high-dimensional data and use these to generate realistic data with high efficacy and efficiency. MoE-Sim-VAE is based on a Variational Autoencoder (VAE), where the decoder consists of a Mixture-of-Experts (MoE) architecture. This specific architecture allows for various modes of the data to be automatically learned by means of the experts. Additionally, we encourage the lower dimensional latent representation of our model to follow a Gaussian mixture distribution and to accurately represent the similarities between the data points. We assess the performance of our model on the MNIST benchmark data set and challenging real-world tasks of clustering mouse organs from single-cell RNA-sequencing measurements and defining cell subpopulations from mass cytometry (CyTOF) measurements on hundreds of different datasets. MoE-Sim-VAE exhibits superior clustering performance on all these tasks in comparison to the baselines as well as competitor methods.
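
As an illustration of the decoder described above, the following is a minimal sketch of a Mixture-of-Experts decoding step, not the authors' implementation. It assumes a gating network that maps a latent vector to a softmax distribution over experts, linear experts, and a gate-weighted sum as the reconstruction. The abstract does not specify these details, so the class `MoEDecoder`, its methods, the random weights, and the use of the gate's argmax as a cluster label are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Affine map; `weights` is a list of rows, `x` a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Hypothetical untrained layer: Gaussian weights, zero bias."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MoEDecoder:
    """Illustrative MoE decoder: one gate plus n_experts linear experts."""

    def __init__(self, latent_dim, data_dim, n_experts, seed=0):
        rng = random.Random(seed)
        self.gate = random_layer(rng, n_experts, latent_dim)
        self.experts = [random_layer(rng, data_dim, latent_dim)
                        for _ in range(n_experts)]

    def decode(self, z):
        # Gate: probability that latent point z belongs to each expert.
        probs = softmax(linear(*self.gate, z))
        # Each expert proposes its own reconstruction of the data point.
        outputs = [linear(w, b, z) for w, b in self.experts]
        # Assumption: the reconstruction is the gate-weighted sum of experts.
        mixed = [sum(p * out[d] for p, out in zip(probs, outputs))
                 for d in range(len(outputs[0]))]
        # Assumption: the most probable expert serves as the cluster label.
        cluster = max(range(len(probs)), key=probs.__getitem__)
        return mixed, probs, cluster
```

Calling `MoEDecoder(latent_dim=2, data_dim=5, n_experts=3).decode([0.5, -1.0])` returns a 5-dimensional reconstruction, the three gate probabilities, and the index of the most probable expert. In a trained model the weights would be learned jointly with the encoder rather than drawn at random.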
