Applying integrative computational models to study the evolution of gene regulation.

Abstract

Gene regulatory networks dynamically control the expression levels of all genes and are key to explaining various phenotypes and biological processes. Advances in high-throughput measurement technologies, such as microarrays and next-generation sequencing, have enabled us to examine globally the cell properties related to gene regulation and to build statistical models that make quantitative predictions. The evolutionary process has left many kinds of traces in present-day biological systems. Studying the evolution of gene regulatory networks in comparable cell types across species is an efficient way to uncover such traces and to better understand regulatory mechanisms. The two main themes of my research are (1) analysing various "omics" data in an evolutionary context to identify conservation and changes in gene regulatory networks, and (2) building computational models that integrate different "omics" data for genome annotation and for predicting the evolution of gene regulation.

The second chapter of my thesis described a computational algorithm for de novo prediction of transcription factor binding site motifs in multiple species. The algorithm, named "GibbsModule", uses three sources of information to improve predictive power: 1) co-expressed genes share the same set of motifs; 2) binding sites co-localize to form modules; and 3) the use of motifs is conserved across species. We developed a Gibbs sampling procedure to incorporate these three sources. GibbsModule outperformed existing algorithms on several synthetic and real datasets. When applied to the binding regions of KLF in embryonic stem cells, GibbsModule discovered a new functional motif. We also used ChIP followed by qPCR to demonstrate that the binding sites predicted by GibbsModule have stronger binding affinity than non-predicted motifs.

Both genome sequence and gene expression carry information about gene regulation. Therefore, we can learn more about gene regulatory networks by jointly analysing sequence and expression data. In the third chapter of my thesis, we first introduced a comparative study of pre-implantation embryonic development in three mammalian species: human, mouse, and cow. We measured time-course expression profiles of the embryos during early development and analysed them together with genome sequence data and ChIP-seq data. We observed that a large portion of homologous genes changed expression, suggesting a prevalent rewiring of gene regulation. We associated the changes in gene expression with different types of cis-changes in the genome sequences. In particular, we found that about 10% of species-specific transposons carry multiple functional binding sites, which are likely to explain the evolution of gene expression. The second part of this chapter presented a phylogenetic model that incorporates changes in motif use and gene expression to infer the rewiring of gene regulatory networks.

Epigenetic modifications, including histone modifications and DNA methylation, are known to be associated with gene regulation. In chapter four, we studied the evolution of epigenomes in pluripotent stem cells of human, mouse, and pig. We observed conservation of epigenomes in different categories of genomic regions. We found evidence of positive and negative selection on the evolution of epigenomes. Linear regression models showed that the evolution of epigenomes can largely explain the evolution of gene expression. In the second part of this chapter, we introduced a statistical model that describes the evolution of genomes in terms of both DNA sequence and epigenetic modifications. Based on this evolutionary model, we improved the current alignment algorithm using information on the distribution of epigenetic modifications.

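Bibliographic record

  • Author

    Xie, Dan.

  • Author affiliation

    University of Illinois at Urbana-Champaign.

  • Degree-granting institution: University of Illinois at Urbana-Champaign.
  • Subject: Bioinformatics; Evolution development; Statistics.
  • Degree: Ph.D.
  • Year: 2011
  • Pagination: 111 p.
  • Total pages: 111
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords: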
