Sequence Database Search Using Jumping Alignments

Abstract

We describe a new algorithm for amino acid sequence classification and the detection of remote homologues. The rationale is to exploit both the vertical and the horizontal information of a multiple alignment in a well-balanced manner. This is in contrast to established methods such as profiles and hidden Markov models, which focus on vertical information because they model the columns of the alignment independently. In our setting, we want to select from a given database of "candidate sequences" those proteins that belong to a given superfamily. To do so, each candidate sequence is separately tested against a multiple alignment of the known members of the superfamily by means of a new jumping alignment algorithm. This algorithm is an extension of the Smith-Waterman algorithm and computes a local alignment of a single sequence and a multiple alignment. In contrast to traditional methods, however, this alignment is not based on a summary of the individual columns of the multiple alignment. Rather, the candidate sequence at each position is aligned to one sequence of the multiple alignment, called the "reference sequence". In addition, the reference sequence may change within the alignment, and each such jump is penalized. To evaluate the discriminative quality of the jumping alignment algorithm, we compared it to hidden Markov models on a subset of the SCOP database of protein domains. The discriminative quality was assessed by counting the number of false positives that ranked higher than the first true positive (FP-count). For moderate FP-counts above five, the number of successful searches with our method was considerably higher than with hidden Markov models.
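
The idea of aligning a candidate to one reference row at a time, with a penalty for each change of row, can be illustrated with a small dynamic program. The following is only a minimal sketch under simplifying assumptions and is not the authors' implementation: the function name `jumping_alignment_score`, the flat match/mismatch scores, the linear gap penalty, the jump penalty value, and the rule that skipping a column is free where the reference row has a gap are all choices made here for illustration.

```python
def jumping_alignment_score(candidate, alignment,
                            match=2, mismatch=-1, gap=2, jump=3):
    """Best local score of `candidate` against the rows of `alignment`.

    Illustrative scoring only: `match`/`mismatch` are added to the score,
    `gap` and `jump` are subtracted. All values are assumptions.
    """
    n, m, K = len(candidate), len(alignment[0]), len(alignment)
    if any(len(row) != m for row in alignment):
        raise ValueError("all alignment rows must have the same length")

    # H[i][j][k]: best score of a local alignment that ends at candidate
    # prefix i and alignment column j with row k as the reference sequence.
    H = [[[0] * K for _ in range(m + 1)] for _ in range(n + 1)]
    # top[i][j]: maximum of H[i][j][k] over all rows k.
    top = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0

    def reach(i, j, k):
        # Score available on row k in cell (i, j): either stay on row k,
        # or jump to row k from the best row and pay the jump penalty.
        return max(H[i][j][k], top[i][j] - jump)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(K):
                ref = alignment[k][j - 1]
                # Skip column j; free if the reference row has a gap there.
                # The floor at 0 makes the alignment local (Smith-Waterman).
                score = max(0, reach(i, j - 1, k) - (0 if ref == "-" else gap))
                # Leave candidate residue i unaligned.
                score = max(score, reach(i - 1, j, k) - gap)
                # Align candidate residue i to the reference residue.
                if ref != "-":
                    sub = match if candidate[i - 1] == ref else mismatch
                    score = max(score, reach(i - 1, j - 1, k) + sub)
                H[i][j][k] = score
            top[i][j] = max(H[i][j])
            best = max(best, top[i][j])
    return best


# Toy family: each row matches only one half of the candidate, so the best
# alignment follows row 0, jumps once, and continues on row 1.
family = ["AAAAWWWW", "WWWWCCCC"]
print(jumping_alignment_score("AAAACCCC", family))  # 13 = 8 * 2 - 3
```

With a prohibitively large jump penalty the same call degenerates to the best single-row local alignment (score 8 in this toy case), which shows how the penalty balances the horizontal information within one row against the vertical information across rows.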

Bibliographic details

Source: International Conference on Intelligent Systems for Molecular Biology, 2000, 9 pages
Conference location: La Jolla, CA (US)
Authors: Rainer Spang; Marc Rehmsmeier; Jens Stoye
Original format: PDF
CLC classification: Molecular biology

Similar literature

1. Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment [J]. Jerome Gracy... Bioinformatics. 1998, No. 2

2. Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases [J]. Wallqvist A, Fukunishi Y, Murphy LR. Bioinformatics. 2000, No. 11

3. A Local Alignment Metric for Accelerating Biosequence Database Search [J]. Peter A. Spiro, Natasa Macura. Journal of computational biology: A journal of computational molecular cell biology. 2004, No. 1

4. Sequence Database Search Using Jumping Alignments [C]. Rainer Spang, Marc Rehmsmeier, Jens Stoye. International Conference on Intelligent Systems for Molecular Biology; 16-23 August 2000; La Jolla, CA (US). 2000

5. Relatedness of biological sequences using alignment and restriction map databases [D]. Kim, Jin. 1996

6. Scalable metagenomics alignment research tool (SMART): a scalable rapid and complete search heuristic for the classification of metagenomic sequences from complex sequence populations [O]. Aaron Y. Lee, Cecilia S. Lee, Russell N. Van Gelder. 2016

7. Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment [O]. J. Gracy, P. Argos. 1998
