
A comparison of cluster analysis methods using DNA methylation data


Abstract

Motivation: Aberrant DNA methylation is common in cancer. DNA methylation profiles differ between tumor types and subtypes and provide a powerful diagnostic tool for identifying clusters of samples and/or genes. DNA methylation data obtained with the quantitative, highly sensitive MethyLight technology is not normally distributed; it frequently contains an excess of zeros. Established tools to analyze this type of data do not exist. Here, we evaluate a variety of methods for cluster analysis to determine which is most reliable.

Results: We introduce a Bernoulli–lognormal mixture model for clustering DNA methylation data obtained using MethyLight. We model the outcomes using a two-part distribution having discrete and continuous components. It is compared with standard cluster analysis approaches for continuous data and for discrete data. In a simulation study, we find that the two-part model has the lowest classification error rate for mixture outcome data compared with other approaches. The methods are illustrated using DNA methylation data from a study of lung cancer cell lines. Compared with competing hierarchical clustering methods, the mixture model approaches have the lowest cross-validation error for detecting lung cancer subtype (non-small versus small cell). The Bernoulli–lognormal mixture assigns observations to subgroups with the lowest uncertainty.
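The two-part idea in the abstract can be illustrated with a small sketch. The abstract does not give the model's parametrization, so everything below is an assumption made for illustration: each measurement is zero with probability `1 - pi_pos` and otherwise lognormal with parameters `mu` and `sigma`, loci are treated as independent within a cluster, and "uncertainty" is taken to be one minus the largest posterior membership probability. The function names and the parameter values are hypothetical, and the sketch covers only the density and the cluster-membership step, not parameter estimation.

```python
import math


def two_part_logpdf(x, pi_pos, mu, sigma):
    """Log density of one measurement under a Bernoulli-lognormal two-part model.

    pi_pos: probability that the value is positive (non-zero), in (0, 1)
    mu, sigma: mean and standard deviation of log(x) given x > 0
    """
    if x == 0:
        # Discrete part: point mass at zero.
        return math.log(1.0 - pi_pos)
    # Continuous part: lognormal density for positive values.
    z = (math.log(x) - mu) / sigma
    log_lognormal = -math.log(x * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z
    return math.log(pi_pos) + log_lognormal


def posterior_memberships(sample, weights, params):
    """Posterior probability of each cluster for one sample.

    sample: list of measurements, one per locus
    weights: mixing proportions, one per cluster
    params: params[k][j] = (pi_pos, mu, sigma) for cluster k, locus j
    Loci are treated as independent within a cluster (an assumption).
    """
    log_joint = []
    for w, cluster in zip(weights, params):
        ll = math.log(w)
        for x, (pi_pos, mu, sigma) in zip(sample, cluster):
            ll += two_part_logpdf(x, pi_pos, mu, sigma)
        log_joint.append(ll)
    top = max(log_joint)  # subtract the maximum before exponentiating
    unnorm = [math.exp(v - top) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Made-up parameters for two clusters and two loci (illustrative only).
weights = [0.5, 0.5]
params = [
    [(0.9, 3.0, 1.0), (0.2, 1.0, 1.0)],  # cluster 0: locus 0 usually non-zero
    [(0.1, 1.0, 1.0), (0.8, 3.0, 1.0)],  # cluster 1: locus 1 usually non-zero
]
post = posterior_memberships([25.0, 0.0], weights, params)
uncertainty = 1.0 - max(post)  # assumed definition of assignment uncertainty
```

In a full mixture-model fit, these membership probabilities would form the expectation step of an EM-style procedure, with the cluster parameters re-estimated from the weighted data in between; that estimation step is omitted here.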

Bibliographic record

  • Source
    Bioinformatics | 2004, Issue 12 | pp. 1896-1904 | 9 pages
  • Author affiliations

    Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089, USA;

    Department of Surgery, Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA;

    Department of Surgery, Norris Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA;

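  • Indexed in: Science Citation Index (SCI, USA); Chemical Abstracts (CA, USA)
  • Original format: PDF
  • Language of full text: English
  • Chinese Library Classification: Biological sciences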
  • Added to database: 2022-08-17 23:50:19
