Frontiers in Genetics

Integrated Systems for NGS Data Management and Analysis: Open Issues and Available Solutions

Abstract

Next-generation sequencing (NGS) technologies have deeply changed our understanding of cellular processes by delivering an astonishing amount of data at affordable prices; many biology laboratories have already accumulated a large number of sequenced samples. However, managing and analyzing these data poses new challenges, which are easily underestimated by research groups lacking IT and quantitative skills. In this perspective, we identify five issues that research groups approaching NGS technologies should address carefully: (1) adopting a laboratory information management system (LIMS) and safeguarding the resulting raw data structure in downstream analyses; (2) monitoring the flow of the data and standardizing input and output directories and file names, even when multiple analysis protocols are used on the same data; (3) ensuring complete traceability of the analyses performed; (4) enabling inexperienced users to run analyses through a graphical user interface (GUI) acting as a front-end for the pipelines; (5) relying on standard metadata to annotate the datasets and, when possible, using controlled vocabularies, ideally derived from biomedical ontologies. Finally, we discuss the currently available tools in the light of these issues, and we introduce HTS-flow, a new workflow management system conceived to address the concerns we raised. HTS-flow retrieves information from a LIMS database, manages data analyses through a simple GUI, writes output to standard locations, and allows complete traceability of datasets, accompanying metadata, and analysis scripts.
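
Issues (2), (3), and (5) can be illustrated with a minimal Python sketch. It is not HTS-flow's implementation, which the abstract does not describe. The directory layout, the function names `output_dir` and `provenance_record`, and the small `ASSAY_TYPES` vocabulary are all hypothetical, chosen only to show standardized output locations, a traceable record of the analysis script, and validation of metadata against a controlled vocabulary.

```python
import hashlib
import json
from pathlib import PurePosixPath

# Hypothetical controlled vocabulary; a real one would ideally be
# derived from a biomedical ontology.
ASSAY_TYPES = {"ChIP-seq", "RNA-seq", "DNase-seq"}


def output_dir(root, sample_id, pipeline, run_id):
    """Build a standardized output directory: <root>/<sample>/<pipeline>/run_<id>."""
    return PurePosixPath(root) / sample_id / pipeline / f"run_{run_id:04d}"


def provenance_record(sample_id, assay, pipeline, script_text, params):
    """Describe one analysis run so it can be traced back to its script and parameters."""
    if assay not in ASSAY_TYPES:
        raise ValueError(f"assay {assay!r} is not in the controlled vocabulary")
    return {
        "sample_id": sample_id,
        "assay": assay,
        "pipeline": pipeline,
        # Hashing the script text ties the result to the exact code that produced it.
        "script_sha256": hashlib.sha256(script_text.encode("utf-8")).hexdigest(),
        "params": dict(sorted(params.items())),
    }


if __name__ == "__main__":
    # Sample ID, pipeline name, and script text are made-up illustration values.
    path = output_dir("/data/analysis", "S0042", "peak_calling", 7)
    record = provenance_record(
        "S0042", "ChIP-seq", "peak_calling", "echo run", {"threads": 4}
    )
    print(path)
    print(json.dumps(record, indent=2))
```

Deriving every path from the same function means two protocols run on the same sample cannot collide, and storing a hash of the script next to the parameters lets a result be matched to the exact code that produced it.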