Database: The Journal of Biological Databases and Curation

CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

Abstract

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database designed to handle very large datasets, allowing rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human whole-exome (WES) or whole-genome sequencing (WGS) projects. A built-in filtering function takes the variant calls from all individual samples into account simultaneously. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses run in a matter of seconds, even when the database holds several hundred samples and millions of variants. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server.

Database URL:
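The group-wise filtering described in the abstract can be illustrated with a small sketch. This is not canvasDB's R interface or its actual schema, neither of which is given here. It is a minimal Python and SQLite example under assumed names: the table `calls`, the functions `build_demo_db` and `filter_variants`, and the sample labels are all hypothetical. It keeps variants called in every sample of one group and in no sample of another, which is the basic pattern behind looking for candidate causative mutations within a family.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one row per (sample, variant) call.

    The schema and the toy data are assumptions made for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE calls "
        "(sample TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
    )
    rows = [
        ("child", "1", 1000, "A", "G"),
        ("child", "2", 2000, "C", "T"),
        ("child", "3", 3000, "G", "A"),
        ("sibling", "1", 1000, "A", "G"),
        ("sibling", "3", 3000, "G", "A"),
        ("mother", "3", 3000, "G", "A"),
        ("father", "2", 2000, "C", "T"),
    ]
    conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?, ?)", rows)
    # An index on the variant key keeps the grouping step cheap on larger tables.
    conn.execute("CREATE INDEX idx_variant ON calls (chrom, pos, ref, alt)")
    return conn


def filter_variants(conn, in_group, out_group=()):
    """Return variants present in all in_group samples and no out_group sample."""
    if not in_group:
        raise ValueError("in_group must contain at least one sample")
    in_marks = ",".join("?" * len(in_group))
    # CASE without ELSE yields NULL, which COUNT ignores, so each COUNT
    # only sees the samples that belong to the group being tested.
    having = [
        f"COUNT(DISTINCT CASE WHEN sample IN ({in_marks}) THEN sample END) = ?"
    ]
    params = [*in_group, len(set(in_group))]
    if out_group:
        out_marks = ",".join("?" * len(out_group))
        having.append(
            f"COUNT(DISTINCT CASE WHEN sample IN ({out_marks}) THEN sample END) = 0"
        )
        params.extend(out_group)
    query = (
        "SELECT chrom, pos, ref, alt FROM calls "
        "GROUP BY chrom, pos, ref, alt "
        "HAVING " + " AND ".join(having) + " "
        "ORDER BY chrom, pos"
    )
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Shared by both affected children, absent from both parents.
    print(filter_variants(conn, ["child", "sibling"], ["mother", "father"]))
    # prints [('1', 1000, 'A', 'G')]
```

Storing one row per call and grouping by variant lets a single query consider every sample at once. A system at the scale reported in the abstract would need a more compact layout than this toy table.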