Reference sequence selection for motif searches




Abstract

The planted (l, d) motif search (PMS) is an important yet challenging problem in computational biology. Pattern-driven PMS algorithms usually use k out of t input sequences as reference sequences to generate candidate motifs, and they can find all the (l, d) motifs in the input sequences. However, most of them simply take the first k sequences in the input as reference sequences without an elaborate selection process, and thus their running time may fluctuate sharply, especially for large alphabets. In this paper, we formulate the reference sequence selection problem and propose a method named RefSelect that solves it quickly by evaluating the number of candidate motifs generated from the reference sequences. RefSelect brings a practical running-time improvement to state-of-the-art pattern-driven PMS algorithms. Experimental results show that RefSelect (1) makes the tested algorithms solve the PMS problem steadily and efficiently, (2) in particular, gives them a speedup of up to about 100× on protein data, and (3) is also suitable for large data sets containing hundreds of sequences or more.
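
To make the role of the reference sequence concrete, here is a minimal brute-force sketch of pattern-driven PMS with a single reference sequence (k = 1). It is not RefSelect and not any of the algorithms tested in the paper. The function names, the toy sequences and the exhaustive neighbor enumeration are all assumptions made for illustration. The sketch only shows that the set of (l, d) motifs found is the same whichever sequence serves as the reference, while the number of candidates to verify is not.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def neighbors(lmer, d, alphabet):
    """All strings within Hamming distance d of lmer (assumes d <= len(lmer)).

    Choosing d positions and writing any letter there (including the
    original one) covers every string at distance 0..d.
    """
    result = set()
    for positions in combinations(range(len(lmer)), d):
        for letters in product(alphabet, repeat=d):
            cand = list(lmer)
            for pos, ch in zip(positions, letters):
                cand[pos] = ch
            result.add("".join(cand))
    return result


def occurs(motif, seq, d):
    """True if some l-mer of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def pms_with_reference(seqs, ref_index, l, d, alphabet):
    """Illustrative pattern-driven search using one reference sequence.

    Returns (motifs, number_of_candidates). Candidates are the
    d-neighbors of every l-mer in the chosen reference sequence.
    """
    ref = seqs[ref_index]
    cands = set()
    for i in range(len(ref) - l + 1):
        cands |= neighbors(ref[i:i + l], d, alphabet)
    motifs = {m for m in cands if all(occurs(m, s, d) for s in seqs)}
    return motifs, len(cands)


# Toy input (hypothetical data, not from the paper).
seqs = ["ACGTAC", "AAAAAA", "TACGTT"]
results = [pms_with_reference(seqs, i, 3, 1, "ACGT")
           for i in range(len(seqs))]
for i, (motifs, n_cands) in enumerate(results):
    print(f"reference {i}: {n_cands} candidates, motifs = {sorted(motifs)}")
```

In this toy input the second sequence contains a single distinct 3-mer, so using it as the reference yields only 10 candidates to verify, fewer than the other two choices, while the reported motif set does not change. A selection method in the spirit of the abstract would estimate such candidate counts cheaply before committing to a reference, rather than enumerating them as this sketch does.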


Bibliographic details

Source
IEEE International Conference on Bioinformatics and Biomedicine | 2015 | pp. 569-574 | 6 pages
Authors
Yu Qiang; Huo Hongwei; Ruixing Zhao; Dazheng Feng; Vitter Jeffrey Scott; Jun Huan
Original format: PDF
Keywords
Refining; Planted (l, d) motif search; pattern-driven; reference sequences


Similar literature


1. SIRW: a web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches [J]. Chenna Ramu. Nucleic Acids Research. 2003, No. 13
2. A software program combining sequence motif searches with keywords for finding repeats containing DNA sequences [J]. Mehmet Bilgen, Mehmet Karaca, A. Nad Onus. Bioinformatics. 2004, No. 18
3. Reference sequence selection for motif searches [C]. Yu Qiang, Huo Hongwei, Ruixing Zhao. IEEE International Conference on Bioinformatics and Biomedicine. 2015
4. Computational methods for finding regulatory DNA motifs using sequence characteristics and positional preferences [D]. Tharakaraman, Kannan. 2008
5. SIRW: a web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches [O]. Chenna Ramu. 2003
6. SIRW: a web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches [O]. Ramu, Chenna. 2003
