
Statistical analysis and meta-analysis of microarray data.


Abstract

Microarray technology provides a high-throughput technique for studying gene expression. Microarrays can help us diagnose different types of cancer, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data, so effective data processing and analysis are critical for making reliable inferences from the data.

The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiates between samples. A comparative study of different classification tools was performed, and the best combinations of gene selection method and classifier for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior.

The same biological problem may be studied at different research labs, using different lab protocols or samples. In such situations, it is important to combine the results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher's inverse chi-square method, the Logit method, Stouffer's Z-transform method, and the Liptak-Stouffer weighted Z-method) were investigated in this dissertation.
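
As a rough illustration of the pooling techniques named above, the following sketch combines p-values from independent experiments using their standard textbook definitions. It is not the dissertation's implementation: the function names are hypothetical, the inputs are assumed to be independent one-sided p-values strictly between 0 and 1, and the Logit method is left out because its reference Student t distribution is not available in the Python standard library.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def fisher_combined(p_values):
    """Fisher's inverse chi-square method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees
    of freedom under the null, where k is the number of experiments.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combined(p_values, weights=None):
    """Stouffer's Z-transform method.

    With weights supplied this becomes the Liptak-Stouffer weighted Z-method:
    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)), where z_i is the normal
    quantile of 1 - p_i.
    """
    if weights is None:
        weights = [1.0] * len(p_values)  # equal weights: plain Stouffer
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(numerator / denominator)


# Hypothetical p-values for one gene from three independent experiments.
example = [0.04, 0.10, 0.03]
print(fisher_combined(example))
print(stouffer_combined(example))
print(stouffer_combined(example, weights=[2.0, 1.0, 1.0]))
```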
These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified.

The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet basis, were shown to be effective in producing clusters that were biologically more meaningful.
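
To make the wavelet step concrete, here is a minimal sketch of a discrete wavelet transform applied to one expression profile. The abstract does not say which wavelet bases were used, so the Haar basis is an assumption chosen only because it is the simplest; the function names and the sample profile are hypothetical. The resulting coefficient vector is the kind of feature vector that a clustering algorithm could take as input in place of the raw profile.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in pairs]  # smooth part
    detail = [(signal[i] - signal[i + 1]) / root2 for i in pairs]  # local change
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level Haar DWT.

    Returns coefficients ordered coarsest first:
    [approx_L, detail_L, detail_(L-1), ..., detail_1].
    The signal length must be divisible by 2 ** levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)  # keep coarser details in front
    coefficients = approx
    for detail in details:
        coefficients = coefficients + detail
    return coefficients


# Hypothetical expression profile of one gene over eight time points.
profile = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
print(haar_dwt(profile, levels=3))
```

Because this transform is orthonormal, Euclidean distances between profiles are unchanged by it, so any benefit for clustering would come from selecting or weighting coefficients afterwards, which this sketch does not do.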

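Bibliographic record

  • Author

    Zheng, Gaolin.

  • Author affiliation

    Florida International University.

  • Degree-granting institution Florida International University.
  • Discipline Biology, Bioinformatics; Computer Science.
  • Degree Ph.D.
  • Year 2006
  • Pages 98 p.
  • Total pages 98
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Automation technology, computer technology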
