Reference sequence selection for motif searches

Abstract

The planted (l, d) motif search (PMS) is an important yet challenging problem in computational biology. Pattern-driven PMS algorithms typically use k of the t input sequences as reference sequences to generate candidate motifs, and they can find all (l, d) motifs in the input sequences. However, most of them simply take the first k input sequences as reference sequences without any deliberate selection, so their running time may fluctuate sharply, especially for large alphabets. In this paper, we formulate the reference sequence selection problem and propose a method named RefSelect that solves it quickly by evaluating the number of candidate motifs generated by the reference sequences. RefSelect yields a practical running-time improvement for state-of-the-art pattern-driven PMS algorithms. Experimental results show that RefSelect (1) makes the tested algorithms solve the PMS problem steadily and efficiently, (2) in particular, gives them a speedup of up to about 100× on protein data, and (3) is also suitable for large data sets that contain hundreds of sequences or more.
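To make the role of a reference sequence concrete, the following is a minimal brute-force sketch in Python. It is not RefSelect and not any of the algorithms tested in the paper. It assumes the simplest case of a single reference sequence (k = 1), and all function names (`neighbors`, `candidates`, `occurs`, `planted_motifs`) are hypothetical. Candidate motifs are taken to be all strings within Hamming distance d of some l-mer of the reference sequence, and each candidate is then verified against every input sequence.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def neighbors(lmer, d, alphabet):
    """All strings within Hamming distance d of lmer (its d-neighborhood)."""
    result = set()
    for r in range(d + 1):
        for positions in combinations(range(len(lmer)), r):
            for letters in product(alphabet, repeat=r):
                cand = list(lmer)
                for pos, ch in zip(positions, letters):
                    cand[pos] = ch
                result.add("".join(cand))
    return result


def candidates(seq, l, d, alphabet):
    """Candidate motifs generated from one reference sequence."""
    out = set()
    for i in range(len(seq) - l + 1):
        out |= neighbors(seq[i:i + l], d, alphabet)
    return out


def occurs(motif, seq, d):
    """True if some l-mer of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def planted_motifs(seqs, l, d, alphabet, ref=0):
    """All (l, d) motifs of seqs, using seqs[ref] as the only reference."""
    return {m for m in candidates(seqs[ref], l, d, alphabet)
            if all(occurs(m, s, d) for s in seqs)}


if __name__ == "__main__":
    seqs = ["AAAA", "ACAC"]
    for ref in range(len(seqs)):
        n_cand = len(candidates(seqs[ref], 2, 1, "AC"))
        found = sorted(planted_motifs(seqs, 2, 1, "AC", ref=ref))
        print(ref, n_cand, found)
```

In this toy input the set of motifs found is the same for either reference, but the number of candidates that must be verified differs (3 versus 4). The gap is tiny here; the abstract's point is that it can become large for real data, particularly with large alphabets such as proteins.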

Bibliographic record

Source: IEEE International Conference on Bioinformatics and Biomedicine, 2015, 6 pages
Authors: Yu Qiang; Huo Hongwei; Ruixing Zhao; Dazheng Feng; Vitter Jeffrey Scott; Jun Huan
Original format: PDF
CLC classification: Information processing technology
Keywords: Refining; planted (l, d) motif search; pattern-driven; reference sequences
