Journal: IEEE Transactions on NanoBioscience

Probabilistic Inference on Multiple Normalized Genome-Wide Signal Profiles With Model Regularization


Abstract

Understanding genome-wide protein-DNA interaction signals forms the basis for further focused studies in gene regulation. In particular, chromatin immunoprecipitation with massively parallel DNA sequencing (ChIP-Seq) enables us to measure the in vivo genome-wide occupancy of a DNA-binding protein of interest in a single run. Multiple ChIP-Seq runs thus have the inherent potential to let us decipher the combinatorial occupancies of multiple DNA-binding proteins. To handle the genome-wide signal profiles from those multiple runs, we propose to integrate regularized regression functions (i.e., LASSO, Elastic Net, and Ridge Regression) into the well-established SignalRanker and FullSignalRanker frameworks, resulting in six additional probabilistic models for inference on multiple normalized genome-wide signal profiles. The corresponding model training algorithms are devised and their computational complexity is analyzed. Comprehensive benchmarking is conducted to demonstrate and compare the performance of nine related probabilistic models on the ENCODE ChIP-Seq datasets. The results indicate that the regularized SignalRanker models, in contrast to the original SignalRanker models, can achieve inference performance comparable to that of the FullSignalRanker models while keeping model complexity and time complexity low. Such a feature is especially valuable given the rapid growth of genome-wide signal profile data in recent years.
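
To make the three regularizers named in the abstract concrete, here is a minimal sketch of generic elastic-net coordinate descent, where `alpha=1` gives LASSO, `alpha=0` gives Ridge Regression, and values in between give Elastic Net. This is not the paper's training algorithm: the abstract does not say how the regression functions are embedded in SignalRanker or FullSignalRanker. The function names, the objective scaling, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_fit(X, y, lam, alpha, n_iter=100):
    """Coordinate descent (hypothetical helper) for the objective
        (1/(2n)) * ||y - X b||^2
        + lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).
    alpha=1 is LASSO, alpha=0 is Ridge, 0 < alpha < 1 is Elastic Net.
    Assumes no column of X is entirely zero and there is no intercept.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # mean squared value of column j
            for i in range(n):
                partial = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            coef[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
    return coef


# Made-up toy data: two predictor profiles (columns) and one target profile.
# The target depends strongly on the first column and weakly on the second.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]

lasso = elastic_net_fit(X, y, lam=0.5, alpha=1.0)
enet = elastic_net_fit(X, y, lam=0.5, alpha=0.5)
ridge = elastic_net_fit(X, y, lam=0.5, alpha=0.0)
print("LASSO:", lasso)
print("Elastic Net:", enet)
print("Ridge:", ridge)
```

On this toy data the LASSO and Elastic Net fits set the weak second coefficient exactly to zero, while Ridge only shrinks it. That sparsity is one plausible reading of how regularization could keep model complexity low, though the abstract itself does not spell out the mechanism.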
