Journal of Chromatography & Separation Techniques

ABOid: A Software for Automated Identification and Phyloproteomics Classification of Tandem Mass Spectrometric Data



Abstract

We have developed a suite of bioinformatics algorithms for automated identification and classification of microbes based on comparative analysis of protein sequences. This application uses sequence information of microbial proteins revealed by mass spectrometry-based proteomics for identification and phyloproteomics classification. The algorithms transform the results of searching product ion spectra of peptide ions against a protein database, performed by commercially available software (e.g. SEQUEST), into a taxonomically meaningful and easy-to-interpret output. To achieve this goal, we constructed a custom protein database in FASTA format, composed of theoretical proteomes derived from all fully sequenced bacterial genomes (1204 microorganisms as of August 25th, 2010). Each protein sequence in the database is supplemented with information on its source organism, and the chromosomal position of each protein-coding open reading frame (ORF) is embedded in the protein sequence header. In addition, this information is linked with the taxonomic position of each database bacterium. ABOid analyzes SEQUEST search result files to provide the probabilities that peptide sequence assignments to a product ion mass spectrum (MS/MS) are correct, and uses the accepted spectrum-to-sequence matches to generate a sequence-to-organism (STO) matrix of assignments. Because peptide sequences are differentially present or absent in the various strains being compared, this allows bacterial species to be classified in a high-throughput manner. For this purpose, the STO matrices of assignments, viewed as assignment bitmaps, are next analyzed by an ABOid module that uses phylogenetic relationships between bacterial species as part of a decision tree process and applies multivariate statistical techniques (principal component and cluster analysis) to reveal the relationship of the analyzed unknown sample to the database microorganisms.
Our bacterial classification and identification algorithm assigns an analyzed organism to taxonomic groups following an organized scheme that begins at the phylum level and proceeds through classes, orders, families and genera down to the strain level.
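
The idea of a sequence-to-organism (STO) assignment bitmap can be illustrated with a minimal Python sketch. This is not ABOid's actual code: the function names, the peptide strings and the strain labels are all hypothetical, and the ranking rule (total matched peptides, then peptides unique to one organism) is an assumption standing in for the decision tree and multivariate analysis described in the abstract.

```python
def build_sto_matrix(peptide_hits, organisms):
    """Build a bitmap: one row per accepted peptide, one column per organism.

    peptide_hits maps a peptide sequence to the set of database organisms
    whose theoretical proteome contains it (hypothetical input format).
    """
    return {
        peptide: [1 if org in found_in else 0 for org in organisms]
        for peptide, found_in in peptide_hits.items()
    }


def score_organisms(matrix, organisms):
    """Return {organism: (total_peptides, unique_peptides)}."""
    totals = [0] * len(organisms)
    unique = [0] * len(organisms)
    for row in matrix.values():
        for i, bit in enumerate(row):
            totals[i] += bit
        # A peptide present in exactly one organism is discriminating.
        if sum(row) == 1:
            unique[row.index(1)] += 1
    return {org: (totals[i], unique[i]) for i, org in enumerate(organisms)}


# Made-up data for illustration only.
organisms = ["strain_A", "strain_B", "strain_C"]
peptide_hits = {
    "LSDEQK": {"strain_A", "strain_B"},
    "AVTGYR": {"strain_A"},
    "GFNELK": {"strain_A", "strain_B", "strain_C"},
    "TPMWDR": {"strain_A"},
}

sto = build_sto_matrix(peptide_hits, organisms)
scores = score_organisms(sto, organisms)
# Rank by (total, unique); tuples compare element by element.
best = max(scores, key=lambda org: scores[org])
print(best, scores[best])
```

In this toy input, peptides shared by every strain carry no discriminating power, while peptides found in a single strain separate it from its neighbors, which is the property the abstract relies on when it treats differential presence or absence of peptides as the basis for classification.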
