Journal: Molecular & Cellular Proteomics (MCP)

N-terminal Proteomics Assisted Profiling of the Unexplored Translation Initiation Landscape in Arabidopsis thaliana


Abstract

Proteogenomics is an emerging research field that still lacks a uniform method of analysis. Proteogenomic studies that combine N-terminal proteomics and ribosome profiling suggest that a large number of protein start sites are currently missing from genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein-coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and in computational translations of sequences stored in public repositories. Most interestingly, complementary evidence from ribosome profiling was found for 23 protein N termini. Finally, an in silico analysis of protein N-terminal peptides demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes.
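The abstract does not spell out how compliance with N-terminal methionine excision (NME) specificity was tested. As a rough illustration only, the sketch below applies the commonly cited rule that the initiator methionine is removed when the second residue has a small side chain (A, C, G, P, S, T or V). The function name `nme_compliant`, the residue set, and the strict yes/no treatment are assumptions made for this example and are not taken from the paper's pipeline.

```python
# Assumption: the commonly cited NME rule, simplified to a strict yes/no check.
# Initiator Met is treated as removed only when residue 2 is one of these.
NME_PERMISSIVE = frozenset("ACGPSTV")


def nme_compliant(protein: str, peptide_start: int) -> bool:
    """Check whether a peptide observed at `peptide_start` (0-based) within a
    putative translation product is consistent with translation initiation.

    Hypothetical helper for illustration: `protein` is an in silico
    translation that should begin at a candidate start codon (Met).
    """
    if len(protein) < 2 or protein[0] != "M":
        return False
    second = protein[1]
    if peptide_start == 0:
        # Peptide still carries the initiator Met: expected when
        # residue 2 does not permit excision.
        return second not in NME_PERMISSIVE
    if peptide_start == 1:
        # Peptide starts at residue 2: expected when the initiator
        # Met was excised.
        return second in NME_PERMISSIVE
    # Peptides starting further downstream do not indicate initiation here.
    return False
```

For example, under these assumptions a peptide starting at position 1 of a putative product `MASK...` would count as compliant, whereas one starting at position 1 of `MKLV...` would not.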
