DNA Sequence Compression Using Adaptive Particle Swarm Optimization-Based Memetic Algorithm

Zhu Z.; Zhou J.; Ji Z.; Shi Y.-H.

Abstract

With the rapid development of high-throughput DNA sequencing technologies, the amount of DNA sequence data is accumulating exponentially. The huge influx of data creates new challenges for storage and transmission. This paper proposes a novel adaptive particle swarm optimization-based memetic algorithm (POMA) for DNA sequence compression. POMA is a synergy of comprehensive learning particle swarm optimization (CLPSO) and an adaptive intelligent single particle optimizer (AdpISPO)-based local search. It takes advantage of both CLPSO and AdpISPO to optimize the design of the approximate repeat vector (ARV) codebook for DNA sequence compression. ARV is first introduced in this paper to represent the repeated fragments across multiple sequences in direct, mirror, pairing, and inverted patterns. In POMA, candidate ARV codebooks are encoded as particles, and the optimal solution, which covers the most approximate repeated fragments with the fewest base variations, is identified through the exploration and exploitation of POMA. In each iteration of POMA, the leader particles in the swarm are selected based on weighted fitness values, and each leader particle is fine-tuned with an AdpISPO-based local search, so that the convergence of the search in the local region is accelerated. A detailed comparison study between POMA and the counterpart algorithms is performed on 29 (23 basic and 6 composite) benchmark functions and 11 real DNA sequences. POMA is observed to obtain better or competitive performance with a limited number of function evaluations. POMA also attains lower bits-per-base than other state-of-the-art DNA-specific algorithms on DNA sequence data. The experimental results suggest that the cooperation of CLPSO and AdpISPO in the framework of a memetic algorithm is capable of searching the ARV codebook space efficiently.
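
The four repeat patterns and the "fewest base variations" criterion named in the abstract can be illustrated with a small sketch. It assumes one common reading of the terms (direct = identical copy, mirror = reversed, pairing = base-wise complement, inverted = reverse complement); the abstract itself does not define them, so the paper may differ. All function names here are hypothetical, and nothing below reproduces the ARV codebook or the POMA search.

```python
from __future__ import annotations

# Assumed pattern definitions; not taken from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def direct(fragment: str) -> str:
    """Direct repeat: the fragment as-is."""
    return fragment


def mirror(fragment: str) -> str:
    """Mirror repeat: the fragment read backwards."""
    return fragment[::-1]


def pairing(fragment: str) -> str:
    """Pairing repeat: each base replaced by its complement (A<->T, C<->G)."""
    return fragment.translate(COMPLEMENT)


def inverted(fragment: str) -> str:
    """Inverted repeat: the reverse complement."""
    return fragment.translate(COMPLEMENT)[::-1]


PATTERNS = {
    "direct": direct,
    "mirror": mirror,
    "pairing": pairing,
    "inverted": inverted,
}


def base_variations(a: str, b: str) -> int:
    """Count positions where two equal-length fragments differ."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_pattern(reference: str, fragment: str) -> tuple[str, int]:
    """Find the pattern under which `reference` best approximates `fragment`."""
    scores = {
        name: base_variations(transform(reference), fragment)
        for name, transform in PATTERNS.items()
    }
    name = min(scores, key=scores.get)
    return name, scores[name]


def bits_per_base(compressed_bits: int, num_bases: int) -> float:
    """Compression metric: encoded size in bits divided by sequence length."""
    return compressed_bits / num_bases
```

For example, `best_pattern("AACGT", "ACGTA")` returns `("inverted", 1)`: the reverse complement of the reference is `ACGTT`, which differs from the fragment in one base. For the bits-per-base figure, a plain fixed-width code over the four bases costs 2 bits per base, which is the natural baseline a DNA-specific compressor has to beat.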

Bibliographic information

Source
Evolutionary Computation, IEEE Transactions on | 2011, issue 5 | pp. 643-658 | 16 pages
Authors
Zhu Z.; Zhou J.; Ji Z.; Shi Y.-H.
Author affiliation
Shenzhen City Key Laboratory of Embedded System Design, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Original format: PDF
Text language: English
Keywords
Approximate repeat vector; DNA sequence compression; memetic algorithm; particle swarm optimization
