Journal: Computational Statistics & Data Analysis

On biological validity indices for soft clustering algorithms for gene expression data

Abstract

Unsupervised clustering methods such as K-means, hierarchical clustering and fuzzy c-means have been widely applied to the analysis of gene expression data to identify biologically relevant groups of genes. Recent studies have suggested that the incorporation of biological information into validation methods to assess the quality of clustering results might be useful in facilitating biological and biomedical knowledge discoveries. In this study, we generalize two bio-validity indices, the biological homogeneity index and the biological stability index, to quantify the abilities of soft clustering algorithms such as fuzzy c-means and model-based clustering. The results of an evaluation of several existing soft clustering algorithms using simulated and real data sets indicate that the soft versions of the indices provide both better precision and better accuracy than the classical ones. The significance of the proposed indices is also discussed.
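
The abstract does not state the index formulas, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used pairwise form of the biological homogeneity index for hard clusterings: within each cluster, the fraction of annotated gene pairs that share at least one functional class, averaged over clusters. The membership-weighted variant `soft_bhi` is a hypothetical illustration of how such an index might be extended to soft clusterings; it is not the authors' definition. All function names, the input layout, and the handling of clusters with fewer than two annotated genes are assumptions.

```python
from itertools import combinations


def classical_bhi(clusters, annotations):
    """Pairwise homogeneity for a hard partition (assumed definition).

    clusters: list of lists of gene ids.
    annotations: dict mapping gene id -> set of functional class labels.
    """
    scores = []
    for members in clusters:
        # Only genes with at least one annotation contribute.
        annotated = [g for g in members if annotations.get(g)]
        n = len(annotated)
        if n < 2:
            continue  # assumption: skip clusters with no annotated pair
        shared = sum(
            1 for a, b in combinations(annotated, 2)
            if annotations[a] & annotations[b]
        )
        scores.append(shared / (n * (n - 1) / 2))
    return sum(scores) / len(scores) if scores else 0.0


def soft_bhi(memberships, annotations):
    """Hypothetical membership-weighted variant (not the paper's formula).

    memberships: dict mapping gene id -> list of K membership degrees.
    Each gene pair is weighted by the product of its memberships in cluster k.
    """
    if not memberships:
        return 0.0
    genes = [g for g in memberships if annotations.get(g)]
    n_clusters = len(next(iter(memberships.values())))
    scores = []
    for k in range(n_clusters):
        num = den = 0.0
        for a, b in combinations(genes, 2):
            w = memberships[a][k] * memberships[b][k]
            den += w
            if annotations[a] & annotations[b]:
                num += w
        if den > 0:
            scores.append(num / den)
    return sum(scores) / len(scores) if scores else 0.0
```

With crisp 0/1 memberships the weighted variant reduces to the hard-partition value, which gives a quick sanity check for any alternative weighting scheme.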
