
Improving the specificity of biological signal detection from microarray data.



Abstract

Microarray analysis allows genome-level exploration of gene expression by taking a snapshot of the cell at a specific point in time. Such datasets may provide insight into fundamental biological questions and help address clinical issues such as diagnosis and therapy selection. The resulting datasets are very large and complex, and often sacrifice specificity for scale. Sophisticated computational tools are needed for nontrivial, highly accurate, and consistent biological interpretation of microarray data.

This dissertation addresses the problem of improving the specificity of biological signal detection from microarray data on three levels. First, I developed two robust and accurate algorithms for missing value estimation in microarray data, KNNimpute and SVDimpute. Both algorithms far outperform row averaging and zero filling, and KNNimpute is robust to the choice of parameters, the percentage of values missing, and the type of data. Second, I created MAGIC, a flexible probabilistic framework for gene function prediction based on integrated analysis of high-throughput biological data, including gene expression data and protein-protein interaction data. I applied MAGIC to S. cerevisiae data and showed that it improves the specificity of gene grouping compared to the microarray-based clustering methods it takes as input. Finally, I suggested and evaluated methods for identifying differentially expressed genes and proposed a general procedure for evaluating other biomarker identification methods.
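To make the contrast between nearest-neighbor imputation and row averaging concrete, here is a minimal sketch in plain Python. It is not the dissertation's KNNimpute implementation: the abstract gives no algorithmic details, so the distance measure (root-mean-square difference over shared columns), the inverse-distance weighting, the default `k`, and the function names `knn_impute` and `row_average_impute` are all assumptions made for illustration. Rows stand for genes, columns for experiments, and `None` marks a missing value.

```python
import math


def row_average_impute(matrix):
    """Baseline: fill each missing value (None) with the mean of the
    observed values in the same row (gene)."""
    result = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        result.append([mean if v is None else v for v in row])
    return result


def _distance(a, b):
    """Root-mean-square difference over columns observed in both rows.
    Returns None when the rows share no observed column.
    (Assumed metric; the abstract does not specify one.)"""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(matrix, k=2):
    """Illustrative k-nearest-neighbor imputation.

    For each missing entry (i, j), find the k rows most similar to row i
    that have an observed value in column j, and fill the entry with the
    inverse-distance-weighted average of those values. Entries with no
    usable neighbor are left as None.
    """
    result = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            # Keep the k closest rows by distance.
            candidates.sort(key=lambda c: c[0])
            neighbors = candidates[:k]
            if not neighbors:
                continue
            # Small constant avoids division by zero for identical rows.
            weights = [1.0 / (d + 1e-9) for d, _ in neighbors]
            total = sum(w * v for w, (_, v) in zip(weights, neighbors))
            result[i][j] = total / sum(weights)
    return result


# Toy expression matrix (hypothetical values, not from the dissertation).
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, None],  # behaves like gene 0 in the observed columns
    [9.0, 8.0, 7.0, 6.0],
    [1.1, 2.1, 3.1, 4.2],
]
knn_filled = knn_impute(expr, k=2)
avg_filled = row_average_impute(expr)
```

On this toy matrix the missing entry of the second gene is filled with a value close to 4.0 by `knn_impute`, because its two nearest neighbors have 4.0 and 4.2 in that column, while `row_average_impute` fills it with 2.0, the mean of the gene's own observed values. The example only illustrates why borrowing information from co-expressed genes can help; it says nothing about the accuracy figures reported in the dissertation.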
