SRBreak: A Read-Depth and Split-Read Framework to Identify Breakpoints of Different Events Inside Simple Copy-Number Variable Regions

Hoang T. Nguyen; James Boocock; Tony R. Merriman; Michael A. Black

Abstract

Copy-number variation (CNV) has been associated with increased risk of complex diseases. High-throughput sequencing (HTS) technologies facilitate the detection of copy-number variable regions (CNVRs) and their breakpoints, which aids understanding of genome structure and its evolution. Various approaches have been proposed for detecting CNV breakpoints, but it remains challenging for tools based on a single analysis method to identify them. Pipelines that integrate multiple approaches, however, have been shown to report more reliable breakpoints. Here, based on HTS data, we have developed a pipeline to identify approximate breakpoints (±10 bp) relating to different ancestral events within a specific CNVR. The pipeline combines read-depth and split-read information to infer breakpoints, and uses information from multiple samples so that an imputation approach can be taken. The main steps are: a normal mixture model clusters samples into groups; simple kernel-based approaches then maximize the information obtained from read depth and split reads; finally, the common breakpoints of each group are inferred. The pipeline takes split-read information directly from the CIGAR strings of BAM files, without a re-alignment step. On simulated data sets, it was able to report breakpoints for very low-coverage samples, including those for which only single-end reads were available. When applied to three loci from existing human resequencing data sets (NEGR1, LCE3, IRGM), the pipeline obtained good concordance with results from the 1000 Genomes Project (92, 100, and 82%, respectively). The package is available at https://github.com/hoangtn/SRBreak, and also as a docker-based application at https://registry.hub.docker.com/u/hoangtn/srbreak/.
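To make the split-read step concrete, the following is a minimal sketch of how candidate breakpoint positions might be read from CIGAR strings and then smoothed with a simple Gaussian kernel. It is an illustration under stated assumptions, not the SRBreak implementation: the function names `clip_positions` and `kernel_peak`, the use of soft clips (`S`) as the split-read signal, the Gaussian kernel, and the 10 bp bandwidth are all hypothetical choices made here for the example.

```python
import math
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def clip_positions(pos, cigar):
    """Return 1-based reference coordinates at which a read is soft-clipped.

    pos is the 1-based leftmost mapped position (the SAM POS field).
    Hypothetical helper for illustration; not part of SRBreak.
    """
    # Hard clips (H) carry no sequence, so drop them before inspecting the ends.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    positions = []
    if not ops:
        return positions
    # Leading soft clip: candidate breakpoint at the first aligned base.
    if ops[0][1] == "S":
        positions.append(pos)
    # Trailing soft clip: candidate breakpoint at the last aligned base.
    if len(ops) > 1 and ops[-1][1] == "S":
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        positions.append(pos + ref_len - 1)
    return positions


def kernel_peak(positions, bandwidth=10.0):
    """Return the integer coordinate with the highest Gaussian kernel density.

    The bandwidth is an assumed value chosen for this sketch.
    """
    best, best_score = None, -1.0
    for x in range(min(positions), max(positions) + 1):
        score = sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions
        )
        if score > best_score:
            best, best_score = x, score
    return best


# Toy reads as (POS, CIGAR) pairs; the last read is fully aligned and contributes nothing.
reads = [(100, "60M40S"), (101, "59M41S"), (160, "30S70M"), (500, "100M")]
candidates = [p for pos, cigar in reads for p in clip_positions(pos, cigar)]
print(candidates)               # [159, 159, 160]
print(kernel_peak(candidates))  # 159
```

Pooling clip positions across the samples in one cluster before taking the kernel peak is one way a shared breakpoint could be estimated even when individual low-coverage samples contribute only one or two split reads each.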

Bibliographic details

Source: Frontiers in Genetics | 2016, Issue 1 | 14 pages
Authors: Hoang T. Nguyen; James Boocock; Tony R. Merriman; Michael A. Black
Original format: PDF
Chinese Library Classification: Genetics
