Journal: Bioinformatics

HIGEDA: a hierarchical gene-set genetics based algorithm for finding subtle motifs in biological sequences

Abstract

Motivation: Identification of motifs in biological sequences is a challenging problem because such motifs are often short, degenerate, and may contain gaps. Most algorithms that have been developed for motif-finding use the expectation-maximization (EM) algorithm iteratively. Although EM algorithms can converge quickly, they depend strongly on initialization parameters and can converge to local sub-optimal solutions. In addition, they cannot generate gapped motifs. The effectiveness of EM algorithms in motif finding can be improved by incorporating methods that choose different sets of initial parameters to enable escape from local optima, and that allow gapped alignments within motif models.

Results: We have developed HIGEDA, an algorithm that uses the hierarchical gene-set genetic algorithm (HGA) with EM to initiate and search for the best parameters for the motif model. In addition, HIGEDA can identify gapped motifs using a position weight matrix and dynamic programming to generate an optimal gapped alignment of the motif model with sequences from the dataset. We show that HIGEDA outperforms MEME and other motif-finding algorithms on both DNA and protein sequences.
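To make the idea of aligning a position weight matrix (PWM) to a sequence with gaps more concrete, here is a minimal Python sketch of a dynamic-programming scorer. It is an illustration only and is not taken from HIGEDA: the function name `gapped_pwm_score`, the helper `toy_pwm`, the linear gap penalty, and the toy match/mismatch scores are all assumptions made for this example. The abstract does not describe the actual scoring scheme or recurrences the authors use.

```python
def toy_pwm(consensus, match=2.0, mismatch=-1.0, alphabet="ACGT"):
    """Hypothetical helper: build a toy PWM (one score dict per motif column)."""
    return [{b: (match if b == c else mismatch) for b in alphabet}
            for c in consensus]


def gapped_pwm_score(pwm, seq, gap=-2.0):
    """Best score of aligning all PWM columns to any region of seq.

    Gaps are allowed on both sides: a motif column may be skipped, or a
    sequence residue may be skipped between columns. Each skip costs `gap`
    (an assumed linear penalty). The motif may start anywhere in seq.
    """
    n = len(seq)
    # prev[j]: best score for the columns processed so far, with the
    # alignment ending at sequence position j. Zero columns cost nothing,
    # which gives a free start position.
    prev = [0.0] * (n + 1)
    for column in pwm:
        # With no residues consumed, every column so far must be skipped.
        cur = [prev[0] + gap] + [0.0] * n
        for j in range(1, n + 1):
            match = prev[j - 1] + column[seq[j - 1]]  # column aligned to residue j
            skip_column = prev[j] + gap               # gap in the sequence
            skip_residue = cur[j - 1] + gap           # gap in the motif
            cur[j] = max(match, skip_column, skip_residue)
        prev = cur
    # Free end: the alignment may stop at any sequence position.
    return max(prev)


pwm = toy_pwm("ACG")
exact = gapped_pwm_score(pwm, "TTACGTT")    # motif occurs without gaps
gapped = gapped_pwm_score(pwm, "TTACTGTT")  # one extra residue inside the motif
print(exact, gapped)
```

In this toy setting the gapped occurrence still scores above any ungapped placement in the second sequence, which is the kind of signal an ungapped EM model would miss. A real implementation would use log-odds scores estimated from data and would likely treat gap opening and extension separately.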
