Journal: International Journal of Data Mining and Bioinformatics

K-walks: clustering gene-expression data using a K-means clustering algorithm optimised by random walks


Abstract

Gene-expression data obtained from biological experiments typically have thousands of dimensions, which makes them difficult for biologists to interpret as a whole. Clustering analysis is an exploratory data-mining technique for statistical data analysis that is widely used in gene-expression data analysis. Practical approaches to the clustering problem use iterative procedures such as K-means, which typically converge to one of many local minima. Here, we propose a simulated annealing approximation algorithm that is optimised using random walks to solve the K-means clustering problem. The algorithm is verified on synthetic and real-world data sets and compared with other well-known K-means variants. The new algorithm is less sensitive to initial cluster centres, and its primary strength is its ability to produce high-quality clustering results for thousands of high-dimensional data points. However, the algorithm is computationally intensive.
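
The abstract does not spell out the K-walks procedure, so the following is only a minimal sketch of the general idea it names: simulated annealing applied to the K-means objective, with a random-walk style move. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `sse` and `anneal_kmeans`, the single-point reassignment move, the geometric cooling schedule, and the default parameter values.

```python
import math
import random


def sse(points, labels, k):
    """Within-cluster sum of squared distances (the K-means objective)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    total = 0.0
    for p, c in zip(points, labels):
        for d in range(dim):
            # counts[c] >= 1 here because point p itself belongs to cluster c.
            centre = sums[c][d] / counts[c]
            total += (p[d] - centre) ** 2
    return total


def anneal_kmeans(points, k, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Hypothetical simulated-annealing search over cluster assignments.

    Illustrative only: the move, schedule, and defaults are assumptions,
    not the published K-walks algorithm.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]  # random initial assignment
    cost = sse(points, labels, k)
    best_labels, best_cost = labels[:], cost
    temp = t0
    for _ in range(steps):
        # Random-walk move: reassign one randomly chosen point.
        i = rng.randrange(len(points))
        old = labels[i]
        labels[i] = rng.randrange(k)
        new_cost = sse(points, labels, k)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements; accept a worse
        # state with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[i] = old  # reject the move and restore the old label
        temp *= cooling  # geometric cooling (assumed schedule)
    return best_labels, best_cost


# Toy usage: two small, well-separated groups of 2-D points.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (9.0, 9.1), (9.2, 8.9), (8.8, 9.0)]
toy_labels, toy_cost = anneal_kmeans(toy, k=2, steps=2000)
print(toy_labels, round(toy_cost, 4))
```

Recomputing the full objective at every step keeps the sketch short but costs time proportional to the data size per move, which is one plausible reason an annealing approach of this kind would be computationally intensive, as the abstract notes. Accepting some worse moves at high temperature is what lets the search leave a poor local minimum that plain K-means would stay in.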
