A Method for Isoform Prediction from RNA-Seq Data by Iterative Mapping

Abstract

Alternative splicing plays an important role in eukaryotic gene expression by producing diverse proteins from a single gene. Predicting how genes are transcribed is of great biological interest. To this end, massively parallel whole-transcriptome sequencing, often referred to as RNA-Seq, is becoming widely used and is revolutionizing the cataloging of isoforms by using a vast number of short mRNA fragments called reads. Conventional RNA-Seq analysis methods typically align reads to a reference genome (mapping) to determine, from an RNA-Seq dataset, which isoforms each gene yields and how much of each isoform is expressed. However, a considerable number of reads cannot be mapped uniquely. These so-called multireads, which map to multiple locations because of short read length and similar sequences, increase the uncertainty as to how genes are transcribed. This causes inaccurate gene expression estimates and leads to incorrect isoform prediction. To cope with this problem, we propose a method for isoform prediction by iterative mapping. The positions from which multireads originate can be estimated from expression levels, whereas quantification of isoform-level expression requires accurate mapping. These two procedures are mutually dependent, and therefore remapping reads is essential. By iterating this cycle, our method estimates gene expression levels more precisely and hence improves predictions of alternative splicing. Our method simultaneously estimates isoform-level expression by computing, with an EM algorithm within each gene, how many reads originate from each candidate isoform. To validate the effectiveness of the proposed method, we compared its performance with conventional methods using an RNA-Seq dataset derived from a human brain. The proposed method had a precision of 66.7% and outperformed conventional methods in terms of the isoform detection rate.
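The abstract does not spell out the EM step, so the following is only a minimal sketch of a generic EM for isoform-level expression within one gene, not the authors' implementation. It assumes a standard mixture model in which a read is drawn from isoform i with probability proportional to its read fraction divided by its length. The function name `em_isoform_fractions`, the input layout (one set of compatible isoform names per read), and the toy lengths are all hypothetical.

```python
from typing import Dict, List, Set


def em_isoform_fractions(
    read_compat: List[Set[str]],
    isoform_lengths: Dict[str, float],
    n_iter: int = 500,
    tol: float = 1e-10,
) -> Dict[str, float]:
    """Estimate the fraction of reads originating from each isoform.

    read_compat[r] is the set of isoforms that read r is compatible with.
    Reads compatible with several isoforms are split fractionally.
    This is an illustrative model, not the method from the paper.
    """
    isoforms = sorted(isoform_lengths)
    # Start from a uniform guess over the candidate isoforms.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}

    for _ in range(n_iter):
        # E-step: distribute each read over its compatible isoforms in
        # proportion to (current fraction / isoform length).
        counts = {iso: 0.0 for iso in isoforms}
        for compat in read_compat:
            weights = {iso: theta[iso] / isoform_lengths[iso] for iso in compat}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read has no compatible isoform with support
            for iso, w in weights.items():
                counts[iso] += w / total

        n_assigned = sum(counts.values())
        if n_assigned == 0.0:
            break  # nothing to estimate from

        # M-step: new fractions are the normalized expected read counts.
        new_theta = {iso: counts[iso] / n_assigned for iso in isoforms}
        delta = max(abs(new_theta[iso] - theta[iso]) for iso in isoforms)
        theta = new_theta
        if delta < tol:
            break

    return theta


# Toy gene with two equal-length isoforms: 6 reads unique to A,
# 2 unique to B, and 4 ambiguous reads compatible with both.
toy_reads = [{"A"}] * 6 + [{"B"}] * 2 + [{"A", "B"}] * 4
toy_fractions = em_isoform_fractions(toy_reads, {"A": 1000.0, "B": 1000.0})
print(toy_fractions)
```

In this toy case the uniquely assigned reads favor isoform A three to one, so the EM converges to splitting the four ambiguous reads in the same ratio, giving read fractions of about 0.75 for A and 0.25 for B. The same idea of weighting ambiguous assignments by current expression estimates is what makes multiread placement and quantification mutually dependent, as the abstract describes.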

Bibliographic record

  • Source
    電子情報通信学会技術研究報告 (IEICE Technical Report) | 2012, No. 108 | pp. 71-77 | 7 pages
  • Author affiliations

    Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka 565-0871, Japan;

    Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka 565-0871, Japan;

    Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka 565-0871, Japan;

    Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka 565-0871, Japan;

  • Indexing information
  • Original format: PDF
  • Language of full text: jpn (Japanese)
  • Chinese Library Classification
  • Keywords

    RNA-Seq; alternative splicing; isoform; mapping;

  • Date indexed: 2022-08-18 00:29:13
