
An Iterative Leave-One-Out Approach to Outlier Detection in RNA-Seq Data

Abstract

The discrete data structure and large sequencing depth of RNA sequencing (RNA-seq) experiments can often generate outlier read counts in one or more RNA samples within a homogeneous group. How to identify and manage outlier observations in RNA-seq data is therefore an emerging topic of interest. One of the main objectives of this research is to develop statistical methodology that effectively balances the impact of outlier observations and achieves maximal power for statistical testing. Improving the accuracy of outlier detection is an important precursor to that goal. Current outlier detection algorithms for RNA-seq data are executed within a testing framework and may be sensitive to sparse data and heavy-tailed distributions. We therefore propose a univariate algorithm that uses a probabilistic approach to measure the deviation between an observation and the distribution generating the remaining data, and we implement it within an iterative leave-one-out design strategy. Analyses of real and simulated RNA-seq data show that the proposed methodology has higher outlier detection rates for both non-normalized and normalized negative binomial distributed data.
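
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of an iterative leave-one-out scheme, not the authors' implementation. It assumes a single gene's raw counts in one homogeneous group, a method-of-moments negative binomial fit to the remaining samples (with a Poisson fallback when the remaining data show no overdispersion), and the smaller of the two tail probabilities as the deviation measure. The function names, the `alpha` threshold, and the `min_remaining` stopping rule are all hypothetical choices made for this example.

```python
import math


def nb_log_pmf(k, size, mu):
    """Log-pmf of a negative binomial with mean mu and variance mu + mu^2/size."""
    log_p = math.log(size) - math.log(size + mu)   # log of success probability
    log_q = math.log(mu) - math.log(size + mu)     # log of (1 - p)
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * log_p + k * log_q)


def poisson_log_pmf(k, mu):
    """Log-pmf of a Poisson with mean mu (used when there is no overdispersion)."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def tail_prob(x, rest):
    """Smaller tail probability of count x under a model fit to `rest`.

    Assumption: method-of-moments fit; needs at least two values in `rest`.
    """
    n = len(rest)
    mu = sum(rest) / n
    var = sum((r - mu) ** 2 for r in rest) / (n - 1)
    if mu == 0:
        return 1.0 if x == 0 else 0.0
    if var > mu:
        size = mu * mu / (var - mu)
        log_pmf = lambda k: nb_log_pmf(k, size, mu)
    else:
        log_pmf = lambda k: poisson_log_pmf(k, mu)
    lower = min(1.0, sum(math.exp(log_pmf(k)) for k in range(x + 1)))  # P(X <= x)
    upper = max(0.0, 1.0 - lower + math.exp(log_pmf(x)))               # P(X >= x)
    return min(lower, upper, 1.0)


def detect_outliers(counts, alpha=0.001, min_remaining=3):
    """Iteratively flag the most deviant count until none falls below alpha.

    Returns the indices of flagged observations in the order they were removed.
    """
    active = list(range(len(counts)))
    flagged = []
    while len(active) > min_remaining:
        scores = []
        for i in active:
            rest = [counts[j] for j in active if j != i]  # leave observation i out
            scores.append((tail_prob(counts[i], rest), i))
        p, worst = min(scores)
        if p >= alpha:
            break  # nothing left that deviates strongly from the rest
        flagged.append(worst)
        active.remove(worst)
    return flagged


counts = [12, 15, 9, 14, 11, 13, 250, 10]
print(detect_outliers(counts))
```

In this toy input, the count of 250 is the only value that is improbable under the distribution estimated from the other samples. The sketch works on raw counts only and does not address normalization, which the abstract treats as a separate case.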
