Journal: Molecular Ecology Resources

PRGmatic: An efficient pipeline for collating genome-enriched second-generation sequencing data using a 'provisional-reference genome'


Abstract

Second-generation sequencing is increasingly being used in combination with genome-enrichment techniques to amplify a large number of loci in many individuals for the purpose of population genetic and phylogeographic analysis. Compiling all the necessary tools to analyse these data is complex and time-consuming. Here, we assemble a set of programs and pipe them together with Perl, enabling research laboratories without a dedicated bioinformatician to utilize second-generation sequencing. User input is a folder of the second-generation sequencing reads sorted by individual (in FASTA format) and pipeline output is a folder of multi-FASTA files that correspond to loci (with 2 alleles called per individual). Additional output includes a summary file of the number of individuals per locus, observed and expected heterozygosity for each locus, distribution of multiple hits and summary statistics (θ, Tajima's D, etc.). This user-friendly, open source pipeline, which requires no a priori reference genome because it constructs its own, allows the user to set various parameters (e.g. minimum coverage) in the dependent programs (CAP3, BWA, SAMtools and VarScan) and facilitates evaluation of the nature and quality of data collected prior to analysis in software packages.
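
The per-locus summary described above (number of individuals, observed and expected heterozygosity) can be illustrated with a minimal Python sketch. This is not PRGmatic's own code, which is written in Perl. The header convention `<individual>_<allele index>`, the function names, and the example sequences are all hypothetical and chosen only for illustration. Expected heterozygosity is computed here as Nei's unbiased gene diversity, which may differ from the estimator the pipeline actually uses.

```python
from collections import Counter, defaultdict

# Hypothetical locus file: two allele sequences per individual.
# The '<individual>_<allele index>' header format is an assumption.
EXAMPLE = """\
>ind1_1
ACGT
>ind1_2
ACGT
>ind2_1
ACGT
>ind2_2
ACGA
>ind3_1
ACGA
>ind3_2
ACGA
"""


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from multi-FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def heterozygosity(records):
    """Return (n_individuals, H_obs, H_exp) for one locus."""
    by_individual = defaultdict(list)
    for header, seq in records:
        # Strip the trailing allele index to recover the individual name.
        individual = header.rsplit("_", 1)[0]
        by_individual[individual].append(seq.upper())

    # Keep only individuals with exactly two called alleles.
    diploids = {k: v for k, v in by_individual.items() if len(v) == 2}
    n = len(diploids)
    if n == 0:
        raise ValueError("no individual has exactly two alleles")

    # Observed heterozygosity: fraction of individuals whose alleles differ.
    h_obs = sum(a != b for a, b in diploids.values()) / n

    # Expected heterozygosity: Nei's unbiased gene diversity over 2n alleles.
    counts = Counter(seq for pair in diploids.values() for seq in pair)
    total = 2 * n
    homozygosity = sum((c / total) ** 2 for c in counts.values())
    h_exp = (total / (total - 1)) * (1 - homozygosity)
    return n, h_obs, h_exp


if __name__ == "__main__":
    print(heterozygosity(parse_fasta(EXAMPLE)))
```

In the example, one of three individuals is heterozygous and the two alleles are equally frequent, so the sketch reports an observed heterozygosity of 1/3 against an expected value of about 0.6.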
