Source: International Conference on Bioinformatics and Computational Biology (conference proceedings)

Integrating RNA-seq transcript signals, Primary and Secondary Structure Information in Differentiating Coding and Non-coding RNA Transcripts


Abstract

Non-coding RNA transcripts, previously treated as "junk", play important roles in various regulatory functions inside the cell. Exploring the differences between non-coding transcripts and protein-coding transcripts helps researchers annotate many novel RNA transcripts. Researchers have already shown that the secondary structures of non-coding RNA transcripts can be used to distinguish them from coding transcripts. However, a large class of non-coding transcripts is completely unstructured, which makes structure-based classification results unreliable. The advent of the next-generation sequencing platform RNA-seq, which is producing more and more expression data for many organisms under different sets of conditions, adds a new dimension to this classification task. In this paper, we present a way to integrate the primary structures (i.e., the sequences), the secondary structures, and the differential expression of the two types of transcripts, obtained from RNA-seq experiments under three different conditions, and use it to design a better classification method. We employed two popular classification algorithms, SVMs and a Conditional Random Forest classifier, to build models that discriminate between the two classes. The prediction accuracies of these classifiers are 93% and 91%, respectively, which are much higher than those of the two existing classification systems, CPC and PORTRAIT.
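The abstract does not list the exact features fed to the classifiers, so the following is only a minimal sketch of what integrating the three information sources into one feature vector could look like. Every feature choice here is an assumption made for illustration: GC content and longest-ORF coverage stand in for the primary structure, a length-normalized minimum free energy (assumed to be precomputed by an external folding tool) stands in for the secondary structure, and log2 fold changes relative to the first of three conditions stand in for differential expression. The function names are hypothetical and are not taken from the paper.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G/C bases in the transcript (illustrative primary-structure feature)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_orf_fraction(seq):
    """Fraction of the transcript covered by its longest ATG..stop ORF.

    Only the three forward reading frames are scanned; this is a simplification.
    """
    seq = seq.upper().replace("U", "T")
    if not seq:
        return 0.0
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best / len(seq)


def expression_features(counts, pseudocount=1.0):
    """log2 fold change of each later condition relative to the first one.

    `counts` holds one expression value per condition; the pseudocount
    (an assumption) avoids division by zero for unexpressed transcripts.
    """
    base = counts[0] + pseudocount
    return [math.log2((c + pseudocount) / base) for c in counts[1:]]


def build_feature_vector(seq, mfe, counts):
    """Concatenate sequence, structure and expression features.

    `mfe` is a minimum free energy assumed to come from an external
    folding tool; it is divided by length so long transcripts are comparable.
    """
    mfe_per_nt = mfe / len(seq) if seq else 0.0
    return [gc_content(seq), longest_orf_fraction(seq), mfe_per_nt] + expression_features(counts)


# Hypothetical transcript: sequence, folding energy, counts under three conditions.
vector = build_feature_vector("ATGAAATAGGC", -2.2, [3, 7, 1])
print(vector)
```

Vectors built this way, one per labeled transcript, could then be passed to any standard classifier implementation, such as an SVM or a random forest.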
