Journal: Statistics in Medicine

Incorporating prior information via shrinkage: a combined analysis of genome-wide location data and gene expression data.

Abstract

Transcriptional control is a critical step in the regulation of gene expression. Understanding such control on a genomic level involves deciphering the mechanisms and structures of regulatory programmes and networks. A difficulty arises from the weak signal and high noise in various sources of data, while most current approaches are limited to the analysis of a single source of data. A natural alternative is to improve statistical efficiency and power by a combined analysis of multiple sources of data. Here we propose a shrinkage method to combine genome-wide location data and gene expression data to detect the binding sites or target genes of a transcription factor. Specifically, a prior 'non-target' gene list is generated by analysing the expression data, and then this information is incorporated into the subsequent binding data analysis via a shrinkage method. There is a Bayesian justification for this shrinkage method. Both simulated and real data were used to evaluate the proposed method and compare it with analysing binding data alone. In simulation studies, the proposed method gives higher sensitivity and a lower false discovery rate (FDR) in detecting the target genes. In a real data example, the proposed method can reduce the estimated FDR and increase the power to detect the previously known target genes of a broad transcription regulator, leucine responsive regulatory protein (Lrp), in Escherichia coli. This method can also be used to incorporate other information, such as gene ontology (GO), into microarray data analysis to detect differentially expressed genes.
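The abstract does not give the shrinkage formula, so the following is only a rough sketch of the general idea, not the authors' estimator. It assumes each gene has a z-like binding statistic and that shrinkage means soft-thresholding the statistics of genes on the prior 'non-target' list towards zero before calling targets. The function names, the shrinkage amount, the cutoff and the simulation settings are all hypothetical.

```python
import numpy as np


def shrink_statistics(z, non_target, amount):
    """Soft-threshold the statistics of prior 'non-target' genes towards zero.

    Assumption: soft-thresholding stands in for the paper's shrinkage rule,
    which the abstract does not specify. Other genes are left unchanged.
    """
    z = np.asarray(z, dtype=float)
    non_target = np.asarray(non_target, dtype=bool)
    shrunk = np.sign(z) * np.maximum(np.abs(z) - amount, 0.0)
    return np.where(non_target, shrunk, z)


def call_targets(z, cutoff):
    """Call a gene a target when its absolute statistic exceeds the cutoff."""
    return np.abs(np.asarray(z, dtype=float)) > cutoff


def sensitivity_and_fdp(called, is_target):
    """Return (sensitivity, false discovery proportion) for a set of calls."""
    called = np.asarray(called, dtype=bool)
    is_target = np.asarray(is_target, dtype=bool)
    true_pos = np.sum(called & is_target)
    false_pos = np.sum(called & ~is_target)
    sensitivity = float(true_pos) / max(int(np.sum(is_target)), 1)
    fdp = float(false_pos) / max(int(np.sum(called)), 1)
    return sensitivity, fdp


def demo(seed=0, n_genes=2000, n_targets=100, amount=1.0, cutoff=2.0):
    """Toy simulation comparing binding data alone with the shrunken version."""
    rng = np.random.default_rng(seed)
    is_target = np.zeros(n_genes, dtype=bool)
    is_target[:n_targets] = True
    # Hypothetical binding statistics: targets have a shifted mean.
    z = rng.normal(loc=np.where(is_target, 2.5, 0.0), scale=1.0)
    # Imperfect prior list: covers most non-targets and a few true targets.
    u = rng.random(n_genes)
    non_target = np.where(is_target, u < 0.1, u < 0.7)
    alone = sensitivity_and_fdp(call_targets(z, cutoff), is_target)
    shrunk = shrink_statistics(z, non_target, amount)
    combined = sensitivity_and_fdp(call_targets(shrunk, cutoff), is_target)
    return {"binding_alone": alone, "with_shrinkage": combined}
```

Calling `demo()` returns the sensitivity and false discovery proportion for both analyses on one simulated data set. How the two compare depends on how accurate the prior list is and on the chosen shrinkage amount, so the toy numbers should not be read as reproducing the paper's results.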