BMC Bioinformatics

Empirical estimation of sequencing error rates using smoothing splines

Abstract

Background: Next-generation sequencing has been used to address a diverse range of biological problems, for example through polymorphism and mutation discovery and microRNA profiling. However, compared with conventional sequencing, next-generation sequencing often has higher error rates, which affects downstream genomic analysis. Recently, Wang et al. (BMC Bioinformatics 13:185, 2012) proposed a shadow regression approach that estimates error rates for next-generation sequencing data by assuming a linear relationship between the number of reads sequenced and the number of reads containing errors (denoted as shadows). However, this linear read-shadow relationship may not be appropriate for all types of sequence data, so error rates need to be estimated more reliably without assuming linearity. We propose an empirical error rate estimation approach that uses cubic and robust smoothing splines to model the relationship between the number of reads sequenced and the number of shadows.
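
The contrast between a linear read-shadow fit and a cubic smoothing spline can be sketched as follows. This is a minimal illustration, not the authors' procedure: the data are synthetic, the variable names are hypothetical, the smoothing factor `s` is chosen on the assumption of unit noise variance, and SciPy's `UnivariateSpline` stands in for whatever spline implementation the paper uses. The robust spline variant is not shown.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic, illustrative data only (not taken from the paper):
# shadow counts grow slightly faster than linearly with read counts.
rng = np.random.default_rng(0)
reads = np.linspace(10, 1000, 60)
shadows = 0.05 * reads + 2e-5 * reads**2 + rng.normal(0.0, 1.0, reads.size)


def residual_sum_of_squares(observed, fitted):
    """Sum of squared differences between observed and fitted values."""
    return float(np.sum((observed - fitted) ** 2))


# Linear fit, corresponding to the linearity assumption of shadow regression.
slope, intercept = np.polyfit(reads, shadows, deg=1)
rss_linear = residual_sum_of_squares(shadows, slope * reads + intercept)

# Cubic smoothing spline (k=3). The smoothing factor s is an assumption:
# roughly (number of points) x (noise variance), with variance taken as 1.
spline = UnivariateSpline(reads, shadows, k=3, s=reads.size)
rss_spline = residual_sum_of_squares(shadows, spline(reads))

# Local slope of the fitted curve at each read count. Treating this as a
# rate of shadows per read is an illustrative choice, not the paper's estimator.
local_slopes = spline.derivative()(reads)

print(f"linear RSS: {rss_linear:.1f}")
print(f"spline RSS: {rss_spline:.1f}")
print(f"local slope range: {local_slopes.min():.4f} to {local_slopes.max():.4f}")
```

On data like these, where the true relationship curves, the spline tracks the trend more closely than the straight line, which is the motivation the abstract gives for dropping the linearity assumption.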