Preprocessing and barcoding of data from a single microarray.

Abstract

The ability to measure gene expression based on a single microarray hybridization is necessary for microarrays to be a useful clinical tool. In its simplest form, this amounts to estimating whether or not each gene is expressed in a given sample. Surprisingly, this problem is quite challenging and has received relatively little attention, with most work focusing instead on estimating relative expression.

We propose addressing this problem in three steps. First, we develop a method of assessing the performance of microarray preprocessing methods independent of the microarray technology used. Second, we develop a preprocessing algorithm, frozen RMA (fRMA), that allows one to analyze microarrays individually. Specifically, estimates of probe-specific effects and variances are precomputed and frozen. Then, with new data sets, these are used in concert with information from the new array(s) to normalize and summarize the data. Third, we propose using the distribution of log2 gene intensities across a wide variety of tissues to estimate an expressed and an unexpressed distribution for each gene. Each gene in a sample is then denoted as expressed if its log2 gene intensity is more likely under the expressed distribution than under the unexpressed distribution, and as unexpressed otherwise. The output of this algorithm is a vector of ones and zeros denoting which genes are estimated to be expressed (ones) and unexpressed (zeros). We call this a gene expression barcode.
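
The third step can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not say what form the expressed and unexpressed distributions take, so the sketch assumes each is a normal distribution described by a (mean, standard deviation) pair. The function names `normal_logpdf` and `barcode` and all parameter values are hypothetical.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution (assumed form, not from the source)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def barcode(log2_intensities, expressed_params, unexpressed_params):
    """Return a list of ones and zeros, one entry per gene.

    A gene is coded 1 if its log2 intensity is more likely under that gene's
    expressed distribution than under its unexpressed distribution, else 0.
    Each parameter list holds one (mean, sd) pair per gene.
    """
    bits = []
    for x, (mu_e, sd_e), (mu_u, sd_u) in zip(
        log2_intensities, expressed_params, unexpressed_params
    ):
        expressed = normal_logpdf(x, mu_e, sd_e) > normal_logpdf(x, mu_u, sd_u)
        bits.append(1 if expressed else 0)
    return bits


# Hypothetical values for three genes in one sample.
sample = [9.5, 3.2, 6.0]
expressed_params = [(9.0, 1.0), (8.0, 1.0), (10.0, 1.0)]
unexpressed_params = [(4.0, 1.0), (3.0, 1.0), (4.0, 1.0)]

print(barcode(sample, expressed_params, unexpressed_params))  # [1, 0, 0]
```

In this sketch the per-gene parameters play the role of the distributions estimated ahead of time from a wide variety of tissues, so a single new sample can be barcoded without reference to any other array.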

Bibliographic record

  • Author

    McCall, Matthew N.

  • Author affiliation

    The Johns Hopkins University

  • Degree-granting institution: The Johns Hopkins University
  • Subject: Biology, Biostatistics; Biology, Bioinformatics; Statistics
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 108 p.
  • Total pages: 108
  • Original format: PDF
  • Language of text: English
