Journal: Bioinformatics

A sequential Monte Carlo EM approach to the transcription factor binding site identification problem


Abstract

Motivation: A significant and stubbornly intractable problem in genome sequence analysis has been the de novo identification of transcription factor binding sites in promoter regions. Although theoretically pleasing, probabilistic methods have faced difficulties due to model mismatch and the nature of the biological sequence. These problems result in inference in a high-dimensional, highly multimodal space; consequently, such methods often display only local convergence and hence unsatisfactory performance.

Algorithm: In this article, we derive and demonstrate a novel method utilizing a sequential Monte Carlo-based expectation-maximization (EM) optimization to improve performance in this scenario. The Monte Carlo element should increase the robustness of the algorithm compared to classical EM. Furthermore, the parallel nature of the sequential Monte Carlo algorithm should make it more robust than Gibbs sampling approaches to multimodality problems.

Results: We demonstrate the superior performance of this algorithm on both semi-synthetic and real data from Escherichia coli.
