Cluster analysis of microarray data using the in-group proportion.

Abstract

The cDNA microarray revolutionized the field of biology and ushered in the era of genomics and proteomics. In addition, it has the potential to change the field of statistics because the data generated from microarray experiments are very different from traditional data structures. In statistics, a typical dataset has more observations than features. A typical microarray dataset, however, has many more features than observations. Therefore, methods for analyzing this type of data are required.

A new statistic, the in-group proportion (IGP), is designed to do just that. In this dissertation it is introduced, its properties are described, and two cluster analysis methods that use the in-group proportion are presented. Although both methods may be applied to low-dimensional or high-dimensional data, they are designed specifically for microarray datasets. One method is for estimating the number of clusters present in a dataset. The other method is for statistically validating clusters found in one dataset using an independent dataset. Both methods are shown to be effective when applied to simulated and real datasets. Moreover, the latter method is applied to an extensive cDNA microarray dataset to discover and statistically validate three subtypes of breast cancer that are subsequently shown to also be biologically valid.

These methods and the results are somewhat crude. They are intended to be a starting point upon which more sophisticated procedures and analyses are to be based. Nevertheless, as shown in this dissertation, the in-group proportion and these generally-applicable methods can play important roles, especially prior to the development of more theoretically rigorous statistics and tests.

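Record details

  • Author

    Kapp, Amy Virginia

  • Author affiliation

    Stanford University

  • Degree-granting institution: Stanford University
  • Subjects: Biology, Biostatistics; Statistics
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 182 p.
  • Total pages: 182
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Biomathematical methods; Statistics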
