Venue: IEEE International Conference on Data Mining

Clustering on Sparse Data in Non-overlapping Feature Space with Applications to Cancer Subtyping

Abstract

This paper presents a new algorithm, Reinforced and Informed Network-based Clustering (RINC), for finding unknown groups of similar data objects in a sparse and largely non-overlapping feature space where a network structure among the features can be observed. Sparse, non-overlapping unlabeled data are becoming increasingly common and available, especially in text mining and biomedical data mining. RINC inserts a domain-informed model into a model-free neural network. In particular, our approach integrates physically meaningful feature dependencies into the neural network architecture and a soft computational constraint. Our learning algorithm efficiently clusters sparse data through integrated smoothing and sparse auto-encoder learning. The informed design requires fewer training samples, and at least part of the model becomes explainable. The architecture of the reinforced network layers smooths sparse data over the network dependencies in the feature space. Most importantly, through back-propagation, the weights of the reinforced smoothing layers are simultaneously constrained by the remaining sparse auto-encoder layers, which set the target values equal to the raw inputs. Empirical results demonstrate that RINC achieves improved accuracy and renders physically meaningful clustering results.
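
The abstract names two ingredients: smoothing sparse data over a feature network, and a sparse auto-encoder whose reconstruction target is the raw input. The sketch below illustrates that general idea only. It is not the authors' RINC implementation: the fixed propagation rule, the `alpha` mixing weight, the `tanh` encoder, the L1 penalty, and all function names are assumptions chosen for illustration, and the smoothing weights here are fixed rather than learned by back-propagation.

```python
import numpy as np


def normalize_adjacency(adj):
    """Add self-loops and row-normalize a feature-feature adjacency matrix."""
    adj = adj + np.eye(adj.shape[0])
    return adj / adj.sum(axis=1, keepdims=True)


def smooth(x, adj_norm, alpha=0.5):
    """One propagation step (assumed rule): spread each feature value
    to its network neighbours, then mix with the original values."""
    return alpha * (x @ adj_norm) + (1.0 - alpha) * x


def sparse_autoencoder_loss(x_raw, x_smooth, w_enc, w_dec, l1=1e-3):
    """Encode the smoothed input, decode, and compare against the RAW input.
    The L1 term on the hidden code is an assumed sparsity penalty."""
    hidden = np.tanh(x_smooth @ w_enc)
    x_hat = hidden @ w_dec
    reconstruction = np.mean((x_hat - x_raw) ** 2)
    return reconstruction + l1 * np.mean(np.abs(hidden)), hidden


# Hypothetical toy data: 4 features linked in a chain 0-1-2-3.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Two samples with non-overlapping features (feature 0 vs. feature 1).
x_raw = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])

x_smooth = smooth(x_raw, normalize_adjacency(adjacency))

# Randomly initialized weights; a real model would train these.
rng = np.random.default_rng(0)
w_enc = rng.normal(scale=0.1, size=(4, 2))
w_dec = rng.normal(scale=0.1, size=(2, 4))
loss, hidden = sparse_autoencoder_loss(x_raw, x_smooth, w_enc, w_dec)

print("raw overlap:     ", float(x_raw[0] @ x_raw[1]))
print("smoothed overlap:", float(x_smooth[0] @ x_smooth[1]))
print("loss:", float(loss))
```

In this toy case the two raw samples share no features, so their inner product is zero, while the smoothed samples overlap because features 0 and 1 are neighbours in the network. Clustering would then be run on the hidden codes; that step is omitted here.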
