Home > U.S. National Institutes of Health literature > PLoS Clinical Trials > Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach

Customisation of the Exome Data Analysis Pipeline Using a Combinatorial Approach


Abstract

The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation because of the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable alternative to whole genome runs, but it still requires follow-up validation of all novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology and to the alignment and post-alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and have difficulty choosing among the freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline so that it preferentially retains true variants by minimising false positives, and to make the choice of the right analytical tools easier. To this end, we sampled different freely available tools used at the alignment and post-alignment stages and suggest the most suitable combination, as determined by a simple framework of pre-existing metrics, to create significant datasets.
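To make the two ideas in the abstract concrete, the sketch below shows (a) a consensus filter that keeps variants reported by several callers and (b) a ranking of aligner/caller combinations by simple metrics. It is only an illustration under stated assumptions: the abstract does not name the tools or the metrics, so the tool names (`aligner_A`, `caller_X`, ...) are hypothetical placeholders, and the two metrics used here (fraction of calls at known sites, and transition/transversion ratio) are assumed stand-ins for the "pre-existing metrics", not the authors' actual framework.

```python
from collections import Counter

# A variant is represented as (chromosome, position, ref allele, alt allele).
# This tuple layout is an assumption made for the sketch.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(variants):
    """Transition/transversion ratio of a set of single-nucleotide variants."""
    ti = sum(1 for (_, _, ref, alt) in variants if (ref, alt) in TRANSITIONS)
    tv = len(variants) - ti
    return ti / tv if tv else float("inf")


def consensus(call_sets, min_support=2):
    """Keep variants reported by at least `min_support` of the call sets."""
    counts = Counter(v for calls in call_sets for v in calls)
    return {v for v, n in counts.items() if n >= min_support}


def rank_combinations(calls, known_sites):
    """Rank (aligner, caller) combinations, best first.

    `calls` maps a (aligner, caller) pair to its set of variants.
    `known_sites` is a set of previously catalogued variants (assumed input).
    Combinations are sorted by the fraction of calls found at known sites;
    the Ti/Tv ratio is reported alongside as a second sanity metric.
    """
    rows = []
    for combo, variants in calls.items():
        known = len(variants & known_sites) / len(variants) if variants else 0.0
        rows.append((combo, known, titv_ratio(variants)))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

In practice the call sets would be parsed from each pipeline's variant output rather than written by hand, and the ranking criterion would be whichever metric the investigator trusts for their data.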
