Source: Frontiers in Bioengineering and Biotechnology

G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods

Abstract

Copy Number Variations (CNVs) are the most prevalent type of structural variation (SV) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SV and to study how it is implicated in human diseases. Recently, computational methods based on high-throughput sequencing (HTS) have been increasingly used. The majority of these methods focus on mapping short-read sequences generated from a donor against a reference genome to detect signatures distinctive of CNVs. In particular, read-depth based methods detect CNVs by analyzing genomic regions whose read depth differs significantly from that of other regions. The analysis pipeline of these methods consists of four main stages: i) data preparation, ii) data normalization, iii) CNV region identification, and iv) copy number estimation. However, available tools do not support most of the operations required at the first two stages of this pipeline. Typically, they start the analysis by building the read-depth signal from pre-processed alignments. Therefore, third-party tools must be used to perform most of the preliminary operations required to build the read-depth signal. These data-intensive operations can be efficiently parallelized on Graphics Processing Units (GPUs). In this article we present G-CNV, a GPU-based tool devised to perform the common operations required at the first two stages of the analysis pipeline. G-CNV can filter low-quality read sequences, mask low-quality nucleotides, remove adapter sequences, remove duplicated read sequences, map the short reads, resolve multiple-mapping ambiguities, build the read-depth signal, and normalize it. G-CNV can be used as a third-party tool that prepares data for subsequent read-depth signal generation and analysis. Moreover, it can also be integrated into CNV detection tools to generate read-depth signals.
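To make the first two pipeline stages concrete, the following is a minimal CPU-only sketch in Python of four of the operations named in the abstract: masking low-quality nucleotides, removing duplicated reads, building a read-depth signal, and normalizing it. It is not G-CNV's implementation and does not use a GPU. The function names, the Phred threshold of 20, the fixed window size, and the median-scaling normalization are all assumptions chosen for illustration; the abstract does not specify how G-CNV performs any of these steps.

```python
from statistics import median


def mask_low_quality(seq, quals, min_q=20):
    """Replace bases whose Phred quality is below min_q with 'N'.

    The threshold of 20 is an illustrative assumption.
    """
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


def remove_duplicates(reads):
    """Keep the first occurrence of each identical read sequence, in order."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def read_depth_signal(start_positions, genome_length, window=100):
    """Count reads whose mapped start position falls in each fixed-size window.

    Windowed counting of start positions is one common way to define read
    depth; it is assumed here, not taken from the source.
    """
    n_windows = (genome_length + window - 1) // window  # ceiling division
    depth = [0] * n_windows
    for pos in start_positions:
        if 0 <= pos < genome_length:
            depth[pos // window] += 1
    return depth


def normalize_by_median(depth):
    """Scale the signal so that the median window has value 1.0.

    Median scaling is a simple stand-in for normalization, used only for
    illustration.
    """
    m = median(depth)
    if m == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [d / m for d in depth]


# Hypothetical toy data: 0-based start positions of mapped reads.
starts = [5, 17, 42, 110, 130, 150, 160, 250]
depth = read_depth_signal(starts, genome_length=300, window=100)
normalized = normalize_by_median(depth)
print(depth)       # [3, 4, 1]
print(normalized)  # the first window equals 1.0 because the median depth is 3
```

In a read-depth analysis, windows whose normalized value departs strongly from 1.0 are the candidates passed on to the later stages (CNV region identification and copy number estimation). Each window and each read in this sketch is processed independently of the others, which is the kind of data parallelism the abstract says maps well onto GPUs.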
