PCGEN: A Practical Approach to Projected Clustering and its Application to Gene Expression Data

Abstract

Clustering samples in gene expression data has always been a major challenge because of the high dimensionality of the input space (typically in the tens of thousands) and the small number of samples (typically fewer than a hundred). Moreover, clusters may hide in subspaces of very low dimensionality. Most existing clustering algorithms become substantially inefficient if the required similarity measure is computed between data points in the full-dimensional space. These challenges motivate us to propose a new and efficient partitional, distance-based projected clustering algorithm for clustering samples in gene expression data. Our algorithm can detect projected clusters of extremely low dimensionality embedded in a high-dimensional space, and it avoids computing distances in the full-dimensional space. The suitability of our proposal has been demonstrated through an empirical study using public microarray datasets.
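
The abstract does not describe PCGEN's internals, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea behind projected clustering: two samples can be nearly identical on a few relevant dimensions while their full-dimensional distance is dominated by noise. The helper `projected_distance`, the dimension count, and the choice of relevant dimensions are all hypothetical and chosen for illustration.

```python
import math
import random


def projected_distance(x, y, dims):
    """Euclidean distance between x and y restricted to the indices in dims."""
    return math.sqrt(sum((x[d] - y[d]) ** 2 for d in dims))


random.seed(0)

n_dims = 1000            # hypothetical number of genes (features)
relevant = [3, 17, 256]  # hypothetical subspace in which the cluster lives

# Two samples filled with uniform noise on every dimension...
a = [random.uniform(0.0, 1.0) for _ in range(n_dims)]
b = [random.uniform(0.0, 1.0) for _ in range(n_dims)]

# ...except on the relevant dimensions, where they nearly coincide.
for d in relevant:
    a[d] = 0.5
    b[d] = 0.5 + 0.01

full = projected_distance(a, b, range(n_dims))  # distance over all dimensions
sub = projected_distance(a, b, relevant)        # distance in the subspace only

print(f"full-space distance: {full:.3f}")
print(f"subspace distance:   {sub:.5f}")
```

In this toy setting the subspace distance is tiny while the full-space distance is large, which is why a measure computed over all dimensions can miss a cluster that is tight in only a handful of them. Restricting the computation to a few dimensions also reduces the work per comparison from all features to the size of the subspace.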
