An automated infrastructure to support high-throughput bioinformatics

Abstract

The number of domains affected by the big data phenomenon is constantly increasing, both in science and industry, with high-throughput DNA sequencers being among the most massive data producers. Building analysis frameworks that can keep up with such a high production rate, however, is only part of the problem: current challenges include dealing with articulated data repositories where objects are connected by multiple relationships, managing complex processing pipelines where each step depends on a large number of configuration parameters, and ensuring reproducibility, error control, and usability by non-technical staff. Here we describe an automated infrastructure built to address the above issues in the context of the analysis of the data produced by the CRS4 next-generation sequencing facility. The system integrates open source tools, either written by us or publicly available, into a framework that can handle the whole data transformation process, from raw sequencer output to primary analysis results.

Bibliographic record

Source
International Conference on High Performance Computing & Simulation, 2014, 8 pages
Authors
Cuccuru Gianmauro; Leo Simone; Lianas Luca; Muggiri Michele; Pinna Andrea; Pireddu Luca; Uva Paolo; Angius Andrea; Fotia Giorgio; Zanetti Gianluigi
Original format: PDF
Chinese Library Classification: General issues
Keywords
Bioinformatics; Genomics; Muscles; Simple object access protocol; MapReduce; NGS
