Journal: BMC Bioinformatics

SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics



Abstract

Background: Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. Combining an alternative splicing database with tandem mass spectrometry provides a powerful technique for identifying, analyzing and characterizing potential novel alternative splicing protein isoforms from proteomics data. Therefore, building on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database that 1) provides more coverage of genes, transcripts and alternative splicing, 2) focuses exclusively on alternative splicing, and 3) supports context-specific alternative splicing analysis.

Results: We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or a custom gene set, with maximum coverage and an exclusive focus on alternative splicing. First, we extracted information on the gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides.

The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 alternative splicing peptides, and it covers about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download the alternative splicing related to a single gene/transcript/protein, a custom gene set, a pathway, a disease, a drug, or an organ. Moreover, the quality of the database was validated by comparison with other known databases and by two case studies: 1) in liver cancer and 2) in breast cancer.

Conclusions: The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and to interpret them in the context of pathway, disease, drug and organ specificity or a custom gene set, with maximum coverage and an exclusive focus on alternative splicing.
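The second and third pipeline steps (compiling artificial splicing transcripts, then translating them into peptides) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy exon and intron sequences, and the simplifying assumption that every exon and intron starts in reading frame 0 are all hypothetical and chosen only for illustration. The abstract names Exon Skipping and Intron Retention as the isoform types, so the sketch generates only those two event types.

```python
from itertools import combinations

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    peptide = []
    for pos in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def exon_skipping_transcripts(exons):
    """Join each exon to every non-adjacent downstream exon (hypothetical helper)."""
    for i, j in combinations(range(len(exons)), 2):
        if j - i > 1:
            yield f"ES_{i + 1}_{j + 1}", exons[i] + exons[j]


def intron_retention_transcripts(exons, introns):
    """Keep intron i between exon i and exon i+1 (hypothetical helper)."""
    for i, intron in enumerate(introns):
        yield f"IR_{i + 1}", exons[i] + intron + exons[i + 1]


def synthetic_peptides(exons, introns):
    """Map each artificial transcript label to its translated peptide.

    Assumption: every sequence starts in frame 0, which real genes do not
    guarantee; a real pipeline would track the phase of each exon.
    """
    peptides = {}
    for label, seq in exon_skipping_transcripts(exons):
        peptides[label] = translate(seq)
    for label, seq in intron_retention_transcripts(exons, introns):
        peptides[label] = translate(seq)
    return peptides


# Toy gene with three exons and two introns (made-up sequences).
toy_exons = ["ATGGCC", "AAAGGT", "TTTGAC"]
toy_introns = ["GTAAGT", "GTGAGC"]

for label, peptide in synthetic_peptides(toy_exons, toy_introns).items():
    print(label, peptide)
```

In a proteomics workflow, peptides like these would be added to the search database so that tandem mass spectra matching a junction-spanning or intron-derived peptide can point to a candidate novel isoform.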
