
Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies


Abstract

Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways, including new methods of high-throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 and v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k), while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBS v1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBS v2. Fast-GBS again called more polymorphisms (25.8k vs 22.9k) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines overlapped extensively (79–92%). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50–70%).
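The abstract reports two kinds of figures: the accuracy of GBS genotype calls against re-sequencing data, and the overlap between SNP catalogues. It does not state the exact formulas, so the following Python sketch is only one plausible reading. The names `concordance` and `catalogue_overlap`, the genotype encoding, and the choice to express overlap relative to the smaller catalogue are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]                 # (chromosome, position)
Calls = Dict[Tuple[Site, str], str]    # (site, sample) -> genotype string, e.g. "A/A"


def concordance(gbs: Calls, reseq: Calls) -> float:
    """Fraction of GBS genotypes equal to the re-sequencing genotype.

    Only (site, sample) pairs called in both data sets are compared
    (an assumption; the abstract does not say how missing calls are handled).
    """
    shared = gbs.keys() & reseq.keys()
    if not shared:
        raise ValueError("no genotype calls in common")
    matches = sum(1 for key in shared if gbs[key] == reseq[key])
    return matches / len(shared)


def catalogue_overlap(a: Set[Site], b: Set[Site]) -> float:
    """Shared sites as a fraction of the smaller catalogue (assumed definition)."""
    if not a or not b:
        raise ValueError("empty catalogue")
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    gbs_calls = {
        (("Gm01", 100), "line1"): "A/A",
        (("Gm01", 100), "line2"): "G/G",
        (("Gm02", 250), "line1"): "C/C",
        (("Gm02", 250), "line2"): "C/T",
    }
    reseq_calls = {
        (("Gm01", 100), "line1"): "A/A",
        (("Gm01", 100), "line2"): "G/G",
        (("Gm02", 250), "line1"): "C/C",
        (("Gm02", 250), "line2"): "C/C",
    }
    pipeline_a = {("Gm01", 100), ("Gm02", 250), ("Gm03", 40)}
    pipeline_b = {("Gm01", 100), ("Gm02", 250), ("Gm04", 7), ("Gm05", 9)}

    print(f"accuracy: {concordance(gbs_calls, reseq_calls):.1%}")
    print(f"overlap:  {catalogue_overlap(pipeline_a, pipeline_b):.1%}")
```

Dividing by the smaller catalogue keeps the overlap from being dominated by a pipeline that simply calls more sites. A Jaccard index (shared sites over the union) would be an equally reasonable choice and gives lower values for the same data.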


Bibliographic details

Journal: PLoS ONE
Authors: Davoud Torkamaneh; Jérôme Laroche; François Belzile
Year (volume), issue: 2016 (11), 8
Article number: e0161333
Total pages: 14

Related literature

1. Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data [J]. Davoud Torkamaneh, Jérôme Laroche, Maxime Bastien. BMC Bioinformatics, 2017, no. 1.
2. GBS-SNP-CROP: a reference-optional pipeline for SNP discovery and plant germplasm characterization using variable length, paired-end genotyping-by-sequencing data [J]. Arthur T. O. Melo, Radhika Bartaula, Iago Hale. BMC Bioinformatics, 2016, no. 1.
3. Genome-wide identification of single nucleotide polymorphisms (SNPs) and molecular characterization of Prunus rootstock germplasm using a genotyping-by-sequencing (GBS) approach [J]. V. Guajardo, S. Solis, R. Almada. Acta Horticulturae, 2018, no. 1203.
4. Cluster-Based SNP Calling on Large-Scale Genome Sequencing Data [C]. Kutlu Mucahid, Agrawal Gagan. IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2014.
5. Comparison of SNP Calling Tools for RNA Sequencing Data [D]. Li, Xiang. 2018.
