Nucleic Acids Research (U.S. National Institutes of Health literature)

PolyA-miner: accurate assessment of differential alternative poly-adenylation from 3′Seq data using vector projections and non-negative matrix factorization


Abstract

Almost 70% of human genes undergo alternative polyadenylation (APA) and generate mRNA transcripts with varying lengths, typically of the 3′ untranslated regions (UTR). APA plays an important role in development and cellular differentiation, and its dysregulation can cause neuropsychiatric diseases and increase cancer severity. Increasing awareness of APA’s role in human health and disease has propelled the development of several 3′ sequencing (3′Seq) techniques that allow for precise identification of APA sites. However, despite the recent data explosion, there are no robust computational tools that are precisely designed to analyze 3′Seq data. Analytical approaches that have been used to analyze these data predominantly use proximal to distal usage. With about 50% of human genes having more than two APA isoforms, current methods fail to capture the entirety of APA changes and do not account for non-proximal to non-distal changes. Addressing these key challenges, this study demonstrates PolyA-miner, an algorithm to accurately detect and assess differential alternative polyadenylation specifically from 3′Seq data. Genes are abstracted as APA matrices, and differential APA usage is inferred using iterative consensus non-negative matrix factorization (NMF) based clustering. PolyA-miner accounts for all non-proximal to non-distal APA switches using vector projections and reflects precise gene-level 3′UTR changes. It can also effectively identify novel APA sites that are otherwise undetected when using reference-based approaches. Evaluation on multiple datasets—first-generation MicroArray Quality Control (MAQC) brain and Universal Human Reference (UHR) PolyA-seq data, recent glioblastoma cell line knockdown Poly(A)-ClickSeq (PAC-seq) data, and our own mouse hippocampal and human stem cell-derived neuron PAC-seq data—strongly supports the value and protocol-independent applicability of PolyA-miner. 
Strikingly, in the glioblastoma cell line data, PolyA-miner identified more than twice the number of genes with APA changes than initially reported. With the emerging importance of APA in human development and disease, PolyA-miner can significantly improve data analysis and help decode the underlying APA dynamics.
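
The abstract names three ingredients: genes represented as APA matrices, consensus NMF-based clustering, and vector projections that summarize 3′UTR shifts. The toy sketch below illustrates those ideas in general terms only. It is not PolyA-miner's implementation: the read counts are invented, the function names (`apa_proportions`, `nmf`, `consensus_matrix`, `projection_shift`) are hypothetical, the NMF uses standard multiplicative updates, and the evenly spaced proximal-to-distal axis used for the projection is an assumption made here for illustration.

```python
import numpy as np

# Hypothetical read counts for ONE gene: rows = samples, columns = APA sites
# ordered proximal -> distal. Samples 0-2 are "control", 3-5 are "treated".
COUNTS = np.array([
    [80, 15, 5], [75, 20, 5], [85, 10, 5],      # control: mostly proximal site
    [10, 20, 70], [5, 25, 70], [10, 15, 75],    # treated: mostly distal site
], dtype=float)

def apa_proportions(counts):
    """Row-normalize counts so each sample's APA site usage sums to 1."""
    return counts / counts.sum(axis=1, keepdims=True)

def nmf(V, rank=2, n_iter=500, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF (V ~ W @ H), Frobenius objective."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def consensus_matrix(V, rank=2, n_runs=20):
    """Fraction of NMF runs in which each pair of samples shares a cluster."""
    n = V.shape[0]
    consensus = np.zeros((n, n))
    for seed in range(n_runs):
        W, H = nmf(V, rank=rank, seed=seed)
        # Rescale W by the row sums of H so the columns of W are comparable.
        labels = (W * H.sum(axis=1)).argmax(axis=1)
        consensus += labels[:, None] == labels[None, :]
    return consensus / n_runs

def projection_shift(control_mean, treated_mean):
    """Scalar projection of the usage change onto a proximal->distal axis.

    Assumption: sites are evenly spaced on the axis; positive = distal shift.
    """
    axis = np.linspace(-1.0, 1.0, len(control_mean))
    axis /= np.linalg.norm(axis)
    return float((treated_mean - control_mean) @ axis)

USAGE = apa_proportions(COUNTS)
CONSENSUS = consensus_matrix(USAGE)
SHIFT = projection_shift(USAGE[:3].mean(axis=0), USAGE[3:].mean(axis=0))

if __name__ == "__main__":
    print(np.round(CONSENSUS, 2))
    print("projected shift (positive = toward distal sites):", round(SHIFT, 3))
```

In this toy setting, a consensus matrix that splits the samples into blocks matching the experimental groups suggests that APA usage differs between conditions, and the sign of the projected shift indicates whether usage moves toward distal sites (longer 3′UTR) or proximal sites (shorter 3′UTR). Because the projection uses every site, it also registers changes involving intermediate sites, which a proximal-versus-distal ratio would miss.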
