Faster Algorithms for Sampling and Counting Biological Sequences


Abstract

A set of sequences S is pairwise bounded if the Hamming distance between any pair of sequences in S is at most 2d. The Consensus Sequence problem asks to distinguish pairwise bounded sets that have a consensus from those that do not and, when a consensus exists, to find one such sequence s*. This problem is closely related to the motif-recognition problem, which abstractly models finding important subsequences in biological data. We give an efficient algorithm for sampling pairwise bounded sets, referred to as MarkovSampling, and show that it generates pairwise bounded sets uniformly at random. We illustrate the applicability of MarkovSampling to efficiently solving motif-recognition instances. Computing the expected number of motif sets has been a long-standing open problem in motif recognition [1,3]. We consider the related problem of counting the number of pairwise bounded sets, give new bounds on this number, and present an algorithmic approach to computing it.
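To make the definitions in the abstract concrete, the following Python sketch checks whether a set of equal-length sequences is pairwise bounded and searches for a consensus by brute force. It is not the paper's MarkovSampling algorithm, which the abstract does not describe. The function names and the DNA alphabet are hypothetical, and the code assumes the usual meaning of a consensus, namely a sequence within Hamming distance d of every sequence in S; the abstract does not spell this out.

```python
from itertools import combinations, product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_pairwise_bounded(seqs, d: int) -> bool:
    """True if every pair of sequences is within Hamming distance 2d."""
    return all(hamming(a, b) <= 2 * d for a, b in combinations(seqs, 2))


def find_consensus(seqs, d: int, alphabet: str = "ACGT"):
    """Exhaustive search for a consensus; exponential, for illustration only.

    Assumption (not stated in the abstract): a consensus is a sequence
    within Hamming distance d of every sequence in the set.
    """
    length = len(seqs[0])
    for candidate in product(alphabet, repeat=length):
        s = "".join(candidate)
        if all(hamming(s, t) <= d for t in seqs):
            return s
    return None


if __name__ == "__main__":
    S = ["AAAA", "AATT"]
    print(is_pairwise_bounded(S, d=1))
    print(find_consensus(S, d=1))
```

By the triangle inequality, a set that is not pairwise bounded cannot have a consensus under this assumption. The exhaustive search only confirms this on small inputs and does not scale to realistic sequence lengths.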


Bibliographic details

Source: String Processing and Information Retrieval, 2009, pp. 243-253 (11 pages)
Conference location: Saariselka (FI)
Author: Christina Boucher
Affiliation: David R. Cheriton School of Computer Science, University of Waterloo
Original format: PDF
Language: English
CLC classification: Computer networks

Similar literature

1. A fast algorithm for exact sequence search in biological sequences using polyphase decomposition. [J]. Srikantha A, Bopardikar AS, Kaipa KK. Bioinformatics, 2010, No. 18.
2. A fast algorithm for exact sequence search in biological sequences using polyphase decomposition. [J]. Rangavittal Narayanan. Bioinformatics, 2010, No. 18.
3. A Fast Exact Pattern Matching Algorithm for Biological Sequences. [J]. Sanchita Paul, Mangesh K. Rajak, Gadadhar Sahoo. Journal of Computational Intelligence in Bioinformatics, 2009, No. 3.
4. Faster Algorithms for Sampling and Counting Biological Sequences. [C]. Christina Boucher. International Symposium on String Processing and Information Retrieval, 2009.
5. Fast measurement of heart motion using MRI: Systems, sequences, and algorithms. [D]. Abd-Elmoniem, Khaled Z. 2008.
6. A fast algorithm for exact sequence search in biological sequences using polyphase decomposition. [O]. Abhilash Srikantha, Ajit S. Bopardikar, Kalyan Kumar Kaipa.
7. A fast algorithm for exact sequence search in biological sequences using polyphase decomposition. [O]. Srikantha, Abhilash; Bopardikar, Ajit S.; Kaipa, Kalyan Kumar. 2010.
