Journal: BMC Genomics

Discerning novel splice junctions derived from RNA-seq alignment: a deep learning approach

Abstract

Exon splicing is a regulated cellular process in the transcription of protein-coding genes. Technological advancements and cost reductions in RNA sequencing have made quantitative and qualitative assessments of the transcriptome both possible and widely available. RNA-seq provides unprecedented resolution to identify gene structures and resolve the diversity of splicing variants. However, currently available ab initio aligners are vulnerable to spurious alignments caused by random sequence matches and by discordance between the sample and the reference genome. As a consequence, a significant set of false positive exon junction predictions is introduced, which further confuses downstream analyses of splice variant discovery and abundance estimation. In this work, we present a deep learning based splice junction sequence classifier, named DeepSplice, which employs convolutional neural networks to classify candidate splice junctions. We show that (I) DeepSplice outperforms state-of-the-art methods for splice site classification when applied to the popular benchmark dataset HS3D, (II) DeepSplice shows high accuracy for splice junction classification with GENCODE annotation, and (III) applying DeepSplice to putative splice junctions generated by Rail-RNA alignment of 21,504 human RNA-seq datasets reduces 43 million candidates to around 3 million highly confident novel splice junctions. We have implemented a model, inferred from the sequences of annotated exon junctions, that can classify splice junctions derived from primary RNA-seq data. The performance of the model was evaluated and compared through comprehensive benchmarking and testing, indicating reliable performance and general usability for classifying novel splice junctions derived from RNA-seq alignment.
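
To make the idea of a convolutional classifier over junction sequences concrete, here is a minimal sketch of a forward pass in plain Python. It is not the DeepSplice architecture, which the abstract does not specify: the single convolution layer, the filter count, the kernel width, the random (untrained) weights, and the names `one_hot`, `conv1d_relu`, `score_junction`, and `random_params` are all assumptions chosen for illustration. The example flanking sequences are made up.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 floats; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(x, filters, biases):
    """Valid 1D convolution along the sequence, followed by ReLU.
    x is L x 4, filters is F x K x 4; the result is (L - K + 1) x F."""
    k = len(filters[0])
    out = []
    for start in range(len(x) - k + 1):
        row = []
        for f, b in zip(filters, biases):
            s = b + sum(f[i][c] * x[start + i][c] for i in range(k) for c in range(4))
            row.append(max(0.0, s))
        out.append(row)
    return out

def global_max_pool(feature_map):
    """Keep the strongest activation of each filter across all positions."""
    return [max(col) for col in zip(*feature_map)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_params(n_filters=8, kernel=5, seed=0):
    """Hypothetical, untrained weights; a real model would learn these from annotated junctions."""
    rng = random.Random(seed)
    return {
        "filters": [[[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(kernel)]
                    for _ in range(n_filters)],
        "biases": [0.0] * n_filters,
        "out_weights": [rng.uniform(-0.5, 0.5) for _ in range(n_filters)],
        "out_bias": 0.0,
    }

def score_junction(donor_flank, acceptor_flank, params):
    """Score a candidate junction from the sequence flanking its donor and acceptor sites."""
    x = one_hot(donor_flank + acceptor_flank)
    pooled = global_max_pool(conv1d_relu(x, params["filters"], params["biases"]))
    z = params["out_bias"] + sum(w * p for w, p in zip(params["out_weights"], pooled))
    return sigmoid(z)

params = random_params()
print(score_junction("CAGGTAAGTA", "TTTTCTGCAG", params))
```

With untrained weights the printed score is meaningless; the sketch only shows how a candidate junction could flow from sequence to a probability-like value in (0, 1), which a trained classifier would then threshold to separate likely true junctions from spurious ones.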
