Journal: Artificial Intelligence in Medicine

Memetic algorithms for de novo motif-finding in biomedical sequences


Abstract

Objectives: This study designs and implements a new memetic algorithm for de novo motif discovery and applies it to detect important signals hidden in various biomedical molecular sequences.

Methods and materials: Memetic algorithms are developed and tested on de novo motif-finding problems. Several design strategies are employed not only to explore the multiple sequence local alignment space efficiently, but also to uncover the molecular signals effectively. The resulting memetic motif-finding algorithm (MaMotif) has a number of key features: a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few other similar algorithms using simulated and experimental data, including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences.

Results: The new memetic motif-finding algorithm is successfully implemented in C++ and exhaustively tested with various simulated and real biological sequences. In the simulation, MaMotif is the most time-efficient of the algorithms compared: it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, the new algorithm compares favorably with, or is superior to, the other algorithms. Notably, MaMotif successfully discovers transcription factor binding sites in chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncovers the RNA splicing signals in gene expression, precisely finds the highly conserved helix motif in the transmembrane protein sequences, and correctly detects the palindromic segments in the primary microRNA sequences.

Conclusions: The memetic motif-finding algorithm is effectively designed and implemented, and its applications demonstrate that it is not only time-efficient but also performs excellently when compared with other popular algorithms.
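The abstract names the ingredients of a memetic motif finder (an evolutionary search over alignments, plus a truncated local search that is applied stochastically) without giving details. The following is a minimal sketch of that general idea only, not the MaMotif implementation, which the abstract says is written in C++. Everything here is an assumption made for illustration: the function names (`score`, `local_search`, `memetic_motif_search`), the parameter defaults, the simple consensus score used as fitness, the single-point mutation, and the elitist survivor selection. Python is used for brevity.

```python
import random
from collections import Counter

def score(seqs, starts, w):
    """Consensus score (an assumed fitness): for each of the w motif
    columns, count the occurrences of the most common letter."""
    total = 0
    for j in range(w):
        column = Counter(s[p + j] for s, p in zip(seqs, starts))
        total += column.most_common(1)[0][1]
    return total

def local_search(seqs, starts, w, rng, max_seqs):
    """Truncated local search: re-optimize the motif start in at most
    max_seqs randomly chosen sequences, holding the others fixed."""
    starts = list(starts)  # work on a copy; the caller's list is untouched
    for i in rng.sample(range(len(seqs)), min(max_seqs, len(seqs))):
        best_p, best_s = starts[i], score(seqs, starts, w)
        for p in range(len(seqs[i]) - w + 1):
            starts[i] = p
            s = score(seqs, starts, w)
            if s > best_s:
                best_p, best_s = p, s
        starts[i] = best_p  # keep the best position found (never worse)
    return starts

def memetic_motif_search(seqs, w, pop_size=20, generations=30,
                         ls_prob=0.5, max_seqs=3, seed=0):
    """Evolve a population of alignments (one start position per sequence).
    Each child is a mutated parent; with probability ls_prob it is also
    refined by the truncated local search (the 'memetic' step)."""
    rng = random.Random(seed)
    pop = [[rng.randrange(len(s) - w + 1) for s in seqs]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for parent in pop:
            child = list(parent)
            i = rng.randrange(len(seqs))
            child[i] = rng.randrange(len(seqs[i]) - w + 1)  # point mutation
            if rng.random() < ls_prob:  # stochastic application of local search
                child = local_search(seqs, child, w, rng, max_seqs)
            children.append(child)
        # Elitist survivor selection: keep the pop_size fittest individuals.
        pop = sorted(pop + children,
                     key=lambda c: score(seqs, c, w), reverse=True)[:pop_size]
    best = pop[0]
    return best, score(seqs, best, w)
```

Calling `memetic_motif_search(seqs, w)` with a list of equal-alphabet strings and a motif width `w` returns the best list of start positions found and its score. Limiting the local search to `max_seqs` sequences and applying it only with probability `ls_prob` trades refinement quality for running time, which is the kind of trade-off the abstract's "truncated" and "stochastic" local search features suggest.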
