Variations on probabilistic suffix trees: statistical modeling and prediction of protein families.

Bejerano G; Yona G


Abstract

MOTIVATION: We present a method for modeling protein families by means of probabilistic suffix trees (PSTs). The method is based on identifying significant patterns in a set of related protein sequences. The patterns can be of arbitrary length, and the input sequences do not need to be aligned, nor is delineation of domain boundaries required. The method is automatic, and can be applied, without assuming any preliminary biological information, with surprising success. Basic biological considerations, such as amino acid background probabilities and amino acid substitution probabilities, can be incorporated to improve performance.

RESULTS: The PST can serve as a predictive tool for protein sequence classification, and for detecting conserved patterns (possibly functionally or structurally important) within protein sequences. The method was tested on the Pfam database of protein families with more than satisfactory performance. Exhaustive evaluations show that the PST model detects many more related sequences than pairwise methods such as Gapped-BLAST, and is almost as sensitive as a hidden Markov model that is trained from a multiple alignment of the input sequences, while being much faster.
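The idea of predicting each residue from a variable-length preceding context, learned from unaligned sequences, can be illustrated with a minimal sketch. This is not the authors' algorithm: it omits the tree construction and pruning criteria of the paper and simply backs off to the longest context seen often enough. The function names and the parameters `max_depth`, `min_count`, and `pseudo` are hypothetical choices made for illustration only.

```python
import math
from collections import defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def train_counts(sequences, max_depth=3):
    """Count, for every context of length 0..max_depth, which residue follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, residue in enumerate(seq):
            for k in range(min(i, max_depth) + 1):
                counts[seq[i - k:i]][residue] += 1
    return counts


def next_prob(counts, history, residue, max_depth=3, min_count=2, pseudo=0.5):
    """P(residue | longest suffix of history observed at least min_count times).

    Falls back to shorter suffixes, ending at the empty context.
    Pseudo-counts (an assumption of this sketch) keep every probability positive.
    """
    for k in range(min(len(history), max_depth), -1, -1):
        context = history[len(history) - k:]
        table = counts.get(context, {})
        total = sum(table.values())
        if total >= min_count or k == 0:
            return (table.get(residue, 0) + pseudo) / (total + pseudo * len(ALPHABET))


def avg_log_likelihood(counts, seq, **kwargs):
    """Average per-residue log-probability of seq; higher means a better fit."""
    total = sum(math.log(next_prob(counts, seq[:i], seq[i], **kwargs))
                for i in range(len(seq)))
    return total / len(seq)


# Toy, made-up "family" of unaligned sequences (not real Pfam data).
family = ["MKVLAAGIVG", "MKVLAAGLVG", "MKILAAGIVG"]
model = train_counts(family)
member_score = avg_log_likelihood(model, "MKVLAAGIVG")
other_score = avg_log_likelihood(model, "WYWYPPHHCC")
```

Under these assumptions a sequence resembling the training set should receive a higher average log-likelihood than an unrelated one, which is the basis for using such a model as a classifier. Thresholds for deciding family membership are not specified here and would need to be calibrated on real data.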


Bibliographic details

Source: Bioinformatics, 2001, Issue 1, 21 pages
Authors: Bejerano G; Yona G
Original format: PDF
Language: English
Chinese Library Classification: Biological sciences
Keywords: Models; Statistical; Proteins; Sequence Analysis; Protein methods

Similar literature

1. Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. [J]. Bejerano G, Yona G. Bioinformatics, 2001, Issue 1.
2. Probabilistic suffix array: efficient modeling and prediction of protein families. [J]. Bing-Hua Jiang. Bioinformatics, 2012, Issue 10.
3. Horizontally scalable probabilistic generalized suffix tree (PGST) based route prediction using map data and GPS traces. [J]. Vishnu Shankar Tiwari, Arti Arya. Journal of Big Data, 2017, Issue 1.
4. Local prediction approach for protein classification using probabilistic suffix trees. [C]. Zhaohui Sun, Jitender S. Deogun. Proceedings of the Second conference on Asia-Pacific bioinformatics, 2004.
5. Prediction of Protein Function with a Probabilistic Model for Analysis of Sequence Similarity Networks and Genomic Context. [D]. Yunes, Jeffrey Michael. 2018.
6. Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling. [O]. Armin Meier, Johannes Söding. 2015.
7. Variations on Probabilistic Suffix Trees: Statistical Modeling and Prediction of Protein Families. [O]. Gill Bejerano, Golan Yona. 2001.
