Motif discovery in unaligned DNA sequences using genetic algorithm


Abstract

Motif discovery in unaligned DNA sequences has become a challenging problem in computer science and molecular biology. Finding a cluster of many similar subsequences in a set of biopolymer sequences is evidence that the subsequences occur not by chance but because they share some biological function. Motifs can be used to determine evolutionary and functional relationships. Over the past few decades, many motif discovery algorithms have been designed and developed into publicly available tools. In this paper, we present a motif discovery algorithm based on a Genetic Algorithm (GA). In our approach, we search for potential motifs in a group of DNA sequences of transcription start sites (TSS). Genetic operations such as mutation and crossover are performed with the help of a position weight matrix generated from a set of matched sequences. A rearrangement method is used to reduce the chance of a locally stable motif being selected over a globally stable motif. A preprocessing function is used to relate randomly generated initial motifs to the promoter sequences, and a discursion function is used to minimize computation time. We evaluated our results based on the fitness score and occurrence frequency of a candidate motif in a group of promoter sequences. Our approach gives better results than Finding Motif by Genetic Algorithm (FMGA), which itself showed superior results compared with two other motif-finding algorithms, namely Multiple EM for Motif Elicitation (MEME) and Gibbs Sampler.
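The abstract mentions a position weight matrix built from matched sequences and a fitness score for candidate motifs, but gives no formulas. The following is a minimal Python sketch of those two ideas under stated assumptions, not the authors' implementation: the function names (`best_match`, `position_weight_matrix`, `consensus`, `fitness`), the mismatch-count matching rule, the pseudocount, and the fitness definition (mean fraction of matching positions) are all hypothetical choices made for illustration.

```python
from collections import Counter

BASES = "ACGT"

def best_match(motif, sequence):
    """Return the window of `sequence` with the fewest mismatches to `motif`."""
    k = len(motif)
    windows = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    return min(windows, key=lambda w: sum(a != b for a, b in zip(motif, w)))

def position_weight_matrix(matches, pseudocount=1.0):
    """Per-position base frequencies over equal-length matched subsequences.

    The pseudocount (an assumption) keeps every frequency above zero.
    """
    k = len(matches[0])
    total = len(matches) + pseudocount * len(BASES)
    pwm = []
    for j in range(k):
        counts = Counter(m[j] for m in matches)
        pwm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pwm

def consensus(pwm):
    """Most frequent base at each position of the matrix."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in pwm)

def fitness(motif, sequences):
    """Assumed fitness: mean fraction of positions at which the motif
    agrees with its best-matching window in each sequence."""
    k = len(motif)
    total = 0.0
    for s in sequences:
        w = best_match(motif, s)
        total += sum(a == b for a, b in zip(motif, w)) / k
    return total / len(sequences)

if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    seqs = ["TTACGTAA", "GGACGTCC", "ACGTTTTT"]
    candidate = "ACGA"
    matches = [best_match(candidate, s) for s in seqs]
    pwm = position_weight_matrix(matches)
    print(matches, consensus(pwm), fitness(candidate, seqs))
```

In a GA loop, a matrix like this could guide mutation by replacing a candidate's base with the most frequent base at that position, which is one plausible reading of "performed with the help of position weight matrix"; the paper itself would have to be consulted for the actual operators.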


Bibliographic details

Source: International Conference on Advances in Electrical Engineering | 2017 | pp. 725-730 | 6 pages
Authors: Al Muttakin; Mohammad Rezwanul Huq
Original format: PDF
Keywords: DNA; Prediction algorithms; Genetic algorithms; Algorithm design and analysis; Computer architecture; Pattern matching


Similar literature


1. Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences [J]. K. Vivekanandan, M. Hemalatha. Journal of computer sciences, 2008, No. 8.
2. Motif discoveries in unaligned molecular sequences using self-organizing neural networks [J]. Derong Liu, Xiaoxu Xiong, DasGupta B. IEEE Transactions on Neural Networks, 2006, No. 4.
3. Multiobjective optimization algorithms for motif discovery in DNA sequences [J]. Gonzalez-Alvarez David L., Vega-Rodriguez Miguel A., Rubio-Largo Alvaro. Genetic programming and evolvable machines, 2015, No. 2.
4. Motif discovery in unaligned DNA sequences using genetic algorithm [C]. Al Muttakin, Mohammad Rezwanul Huq. International Conference on Advances in Electrical Engineering, 2017.
5. Novel algorithms for motif discovery in bio-sequence datasets [D]. Balla, Sudha. 2007.
6. RNAProfile: an algorithm for finding conserved secondary structure motifs in unaligned RNA sequences [O]. Giulio Pavesi, Giancarlo Mauri, Marco Stefani. 2004.
7. Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences [O]. M. Hemalatha, K. Vivekanandan. 2008.
