Bayesian model-based methods for the analysis of DNA microarrays with survival, genetic, and sequence data.

Abstract

DNA microarrays measure the expression of thousands of genes or DNA fragments simultaneously, using probes that hybridize specifically to their complementary sequences. Gene expression and microarray data analysis problems play a prominent role in biostatistics, the biological sciences, and clinical medicine.

The first paper proposes a method for finding associations between subjects' survival times and gene expression measured on tumor microarrays. Measurement error is known to bias estimates of survival regression coefficients, and this method minimizes that bias. The latent variable model is shown to detect associations between potentially important genes and survival in a breast cancer dataset that conventional models did not detect, and simulated data demonstrate that the method is robust to misspecification.

The second paper considers the Expression Quantitative Trait Loci (eQTL) detection problem. An eQTL is a genetic locus that influences gene expression, and the major challenges with this type of data are multiple testing and computational cost. The proposed method extends the Mixture Over Marker (MOM) model to include a structured prior probability that accounts for the transcript location. The new technique exploits the fact that genetic markers are more likely to influence transcripts that share the same location on the genome.

The third paper improves the analysis of Chromatin Immunoprecipitation (ChIP) microarray (ChIP-chip) data. ChIP-chip data analysis estimates the motif of specific Transcription Factor Binding Sites (TFBSs) by comparing the immunoprecipitated (IP) DNA sample, which is enriched for the TFBS, with a control sample of general genomic DNA. The probes on the ChIP-chip array are uniformly spaced on the genome, and probes with relatively high intensity in the IP sample have corresponding sequences that are likely to contain the TFBS motif. Existing analytical methods use the array data to discover peaks or regions of IP enrichment, then analyze the sequences of these peaks in a separate procedure to discover the motif. The proposed model integrates enrichment peak finding and motif discovery through a Hidden Markov Model (HMM). Performance comparisons are made between the proposed HMM and previously developed methods.
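To make the peak-finding idea concrete, the following is a minimal sketch of how a two-state HMM can label uniformly spaced probes as background or enriched from their intensities alone. It is not the dissertation's model: the motif-discovery component is omitted entirely, and the Gaussian emissions, the function names `gaussian_logpdf` and `viterbi_two_state`, and all parameter values (`means`, `sds`, `stay_prob`) are assumptions chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution (assumed emission model)."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(intensities, means=(0.0, 2.0), sds=(1.0, 1.0), stay_prob=0.9):
    """Most likely state path: 0 = background, 1 = enriched (hypothetical setup).

    stay_prob is the probability of remaining in the same state between
    adjacent probes; both states start with probability 0.5.
    """
    if not intensities:
        return []
    states = (0, 1)
    log_trans = [
        [math.log(stay_prob if i == j else 1.0 - stay_prob) for j in states]
        for i in states
    ]
    # Initialise with a uniform start distribution.
    score = [
        math.log(0.5) + gaussian_logpdf(intensities[0], means[s], sds[s])
        for s in states
    ]
    backpointers = []
    for x in intensities[1:]:
        new_score, pointers = [], []
        for s in states:
            # Best predecessor state for landing in state s at this probe.
            best_prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(
                score[best_prev]
                + log_trans[best_prev][s]
                + gaussian_logpdf(x, means[s], sds[s])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Because switching states is penalised, a run of several high-intensity probes is labelled as an enriched region, while a single isolated high probe is treated as noise. That smoothing along the genome is the property that makes an HMM a natural fit for uniformly spaced probes.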

Bibliographic record

  • Author

    Gelfond, Jonathan A. L.

  • Author affiliation

    The University of North Carolina at Chapel Hill, Biostatistics.

  • Degree-granting institution The University of North Carolina at Chapel Hill, Biostatistics.
  • Subject Biology, Biostatistics; Biology, Bioinformatics.
  • Degree Ph.D.
  • Year 2007
  • Pages 143 p.
  • Total pages 143
  • Original format PDF
  • Language of text English
  • Chinese Library Classification Biomathematical methods

  • Date added to catalog 2022-08-17 11:39:13
