Journal: Bioinformatics

TFBS identification based on genetic algorithm with combined representations and adaptive post-processing



Abstract

MOTIVATION: Identification of transcription factor binding sites (TFBSs) plays an important role in deciphering the mechanisms of gene regulation. Recently, GAME, a Genetic Algorithm (GA)-based approach with iterative post-processing, has shown superior performance in TFBS identification. However, the basic GA in GAME is not elaborately designed, and may be trapped in local optima in real problems. The feature operators are only applied in the post-processing, but the final performance heavily depends on the GA output. Hence, both effectiveness and efficiency of the overall algorithm can be improved by introducing more advanced representations and novel operators in the GA, as well as designing the post-processing in an adaptive way.

RESULTS: We propose a novel framework GALF-P, consisting of Genetic Algorithm with Local Filtering (GALF) and adaptive post-processing techniques (-P), to achieve both effectiveness and efficiency for TFBS identification. GALF combines the position-led and consensus-led representations used separately in current GAs and employs a novel local filtering operator to get rid of false positives within an individual efficiently during the evolutionary process in the GA. Pre-selection is used to maintain diversity and avoid local optima. Post-processing with adaptive adding and removing is developed to handle general cases with arbitrary numbers of instances per sequence. GALF-P shows superior performance to GAME, MEME, BioProspector and BioOptimizer on synthetic datasets with difficult scenarios and real test datasets. GALF-P is also more robust and reliable when further compared with GAME, the current state-of-the-art approach.

AVAILABILITY: http://www.cse.cuhk.edu.hk/~tmchan/GALFP/.
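
The two representations named in the abstract can be illustrated with a minimal sketch. This is not the GALF implementation: the function names, the majority-vote consensus, and the match-count scoring are all assumptions chosen for illustration, and each sequence is assumed to be at least as long as the motif and to contain exactly one site. A position-led individual stores one start position per sequence, a consensus-led individual stores a motif string, and each can be converted into the other.

```python
from collections import Counter

def consensus_from_positions(seqs, positions, width):
    """Position-led -> consensus: take the most common base in each
    column of the sites selected by the start positions (hypothetical helper)."""
    sites = [s[p:p + width] for s, p in zip(seqs, positions)]
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))

def positions_from_consensus(seqs, consensus):
    """Consensus-led -> positions: in each sequence, pick the first window
    with the highest number of bases matching the consensus (hypothetical helper)."""
    w = len(consensus)

    def matches(s, p):
        return sum(a == b for a, b in zip(s[p:p + w], consensus))

    return [max(range(len(s) - w + 1), key=lambda p: matches(s, p)) for s in seqs]

# Toy data: the motif TGCA is planted once in each sequence.
seqs = ["AATGCATT", "CCTGCAGG", "TGCATTTT"]
consensus = consensus_from_positions(seqs, [2, 2, 0], 4)
positions = positions_from_consensus(seqs, consensus)
```

Holding both views on one individual lets an operator work on whichever is convenient, for example re-deriving positions after the consensus changes. How GALF actually couples the two, and how its local filtering operator scores candidate sites, is described in the paper and not reproduced here.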
