Efficient algorithms for locating the length-constrained heaviest segments with applications to biomolecular sequence analysis

Yaw-Ling Lin; Tao Jiang; Kun-Mao Chao

Abstract

We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum; and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We present an O(n)-time algorithm for the first problem and an O(n log L)-time algorithm for the second. The algorithms have potential applications in several areas of biomolecular sequence analysis, including locating GC-rich regions in a genomic DNA sequence, post-processing sequence alignments, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. Our preliminary tests on both simulated and real data demonstrate that the algorithms are very efficient and able to locate useful (such as GC-rich) regions.
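The two problems stated in the abstract can be illustrated with a short Python sketch. This is not the authors' method, which the abstract does not describe. For problem (i) the sketch swaps in a standard sliding-window minimum over prefix sums (a monotonic deque), which also runs in O(n) time. For problem (ii) it uses a simple O(nL) baseline rather than the O(n log L) algorithm, relying on the fact that some optimal segment has length below 2L: a longer segment can be split into two parts of length at least L, and one of them has an average no smaller than the whole. The function names and the returned `(value, start, end)` convention, with `a[start:end]` as the segment, are assumptions made for illustration.

```python
from collections import deque
from typing import List, Tuple


def _prefix_sums(a: List[float]) -> List[float]:
    """prefix[k] is the sum of the first k elements of a."""
    prefix = [0.0]
    for x in a:
        prefix.append(prefix[-1] + x)
    return prefix


def max_sum_segment_at_most(a: List[float], U: int) -> Tuple[float, int, int]:
    """Problem (i): best non-empty segment a[start:end] with end - start <= U.

    Illustrative O(n) sketch using a monotonic deque, not the paper's algorithm.
    """
    if not a or U < 1:
        raise ValueError("need a non-empty sequence and U >= 1")
    prefix = _prefix_sums(a)
    best = (float("-inf"), 0, 0)
    window = deque()  # candidate start indices, prefix values strictly increasing
    for end in range(1, len(a) + 1):
        new_start = end - 1
        # Drop candidates that can never beat the newer, smaller-or-equal prefix.
        while window and prefix[window[-1]] >= prefix[new_start]:
            window.pop()
        window.append(new_start)
        # Drop starts that would make the segment longer than U.
        while window[0] < end - U:
            window.popleft()
        start = window[0]  # smallest prefix among the allowed starts
        total = prefix[end] - prefix[start]
        if total > best[0]:
            best = (total, start, end)
    return best


def max_average_segment_at_least(a: List[float], L: int) -> Tuple[float, int, int]:
    """Problem (ii): best segment a[start:end] with end - start >= L.

    Illustrative O(nL) baseline: only lengths L .. 2L-1 need to be examined.
    """
    n = len(a)
    if L < 1 or L > n:
        raise ValueError("need 1 <= L <= len(a)")
    prefix = _prefix_sums(a)
    best = (float("-inf"), 0, 0)
    for start in range(n - L + 1):
        last_end = min(n, start + 2 * L - 1)
        for end in range(start + L, last_end + 1):
            avg = (prefix[end] - prefix[start]) / (end - start)
            if avg > best[0]:
                best = (avg, start, end)
    return best


if __name__ == "__main__":
    scores = [2, -5, 3, 4, -1, 6, -9]
    print(max_sum_segment_at_most(scores, 3))
    # Hypothetical GC indicator sequence: 1 for G or C, 0 for A or T.
    gc = [1, 0, 1, 1, 1, 0, 0, 1]
    print(max_average_segment_at_least(gc, 3))
```

For GC-rich region search, one plausible encoding (an assumption here, not taken from the abstract) is to map each base to 1 for G or C and 0 otherwise, so that the maximum-average segment of length at least L is the window with the highest GC fraction.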

Bibliographic details

Source
Journal of Computer and System Sciences | 2002, issue 3 | pp. 570-586 | 17 pages
Authors
Yaw-Ling Lin; Tao Jiang; Kun-Mao Chao
Author affiliation

Department of Computer Science and Information Management, Providence University, 200 Chung Chi Road, Shalu, Taichung County, 433 Taiwan

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language of text: English
Chinese Library Classification: computing technology, computer technology
Keywords
algorithm; efficiency; maximum consecutive subsequence; length constraint; biomolecular sequence analysis; ungapped local alignment
