Published in: International Conference on Advances in Electrical Engineering

Motif discovery in unaligned DNA sequences using genetic algorithm

Abstract

Motif discovery in unaligned DNA sequences is a challenging problem in computer science and molecular biology. Finding a cluster of many similar subsequences in a set of biopolymer sequences is evidence that the subsequences occur not by chance but because they share some biological function. Motifs can be used to determine evolutionary and functional relationships. Over the past few decades, many motif discovery algorithms have been designed and developed into publicly available tools. In this paper, we present a motif discovery algorithm based on a Genetic Algorithm (GA). In our approach, we search for potential motifs in a group of DNA sequences of transcription start sites (TSS). Genetic operations such as mutation and crossover are performed with the help of a position weight matrix generated from a set of matched sequences. A rearrangement method is used to reduce the chance of a locally stable motif being selected over a globally stable motif. A preprocessing function is used to relate randomly generated initial motifs to the promoter sequences, and a discursion function is used to minimize the computational time. We evaluated our results based on a fitness score and the occurrence frequency of a candidate motif in a group of promoter sequences. Our approach gives better results than Finding Motif by Genetic Algorithm (FMGA), which itself showed superior results in comparison with two other motif-finding algorithms, namely Multiple EM for Motif Elicitation (MEME) and Gibbs Sampler.
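To make the role of the position weight matrix more concrete, the following is a minimal Python sketch of a GA loop in which mutation is guided by a matrix built from the best-matching windows of each sequence. It is an illustration under stated assumptions, not the algorithm of the paper: the match-count fitness, the single-point crossover, the keep-the-top-half selection, and all function names (`best_match`, `position_weight_matrix`, `pwm_mutate`, `evolve`) are hypothetical choices, and the rearrangement, preprocessing, and discursion functions mentioned above are not modeled because the abstract does not describe them.

```python
import random
from collections import Counter

BASES = "ACGT"

def hamming_matches(a, b):
    """Count positions where two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))

def best_match(motif, seq):
    """Return the window of seq most similar to motif (first one on ties)."""
    k = len(motif)
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return max(windows, key=lambda w: hamming_matches(motif, w))

def fitness(motif, seqs):
    """Assumed fitness: mean fraction of matching positions, using the
    best window in each sequence (not the paper's scoring function)."""
    total = sum(hamming_matches(motif, best_match(motif, s)) for s in seqs)
    return total / (len(motif) * len(seqs))

def position_weight_matrix(matches):
    """Per-position base frequencies over equal-length matched windows."""
    n = len(matches)
    pwm = []
    for column in zip(*matches):
        counts = Counter(column)
        pwm.append({b: counts[b] / n for b in BASES})
    return pwm

def pwm_mutate(motif, pwm, rng):
    """Set one random position to the most frequent base at that position."""
    i = rng.randrange(len(motif))
    best = max(BASES, key=lambda b: pwm[i][b])
    return motif[:i] + best + motif[i + 1:]

def crossover(a, b, rng):
    """Single-point crossover of two equal-length motifs (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(seqs, k, pop_size=30, generations=40, seed=0):
    """Evolve candidate motifs of length k; returns the fittest one found."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice(BASES) for _ in range(k)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged, refill with mutated offspring.
        pop.sort(key=lambda m: fitness(m, seqs), reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            pwm = position_weight_matrix([best_match(child, s) for s in seqs])
            children.append(pwm_mutate(child, pwm, rng))
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, seqs))
```

Calling `evolve(promoters, k=6)` on a list of promoter strings returns one candidate motif of length 6. Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases, but nothing in this sketch guards against convergence to a local optimum, which is the problem the rearrangement method in the abstract is meant to address.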
