Journal: British Biotechnology Journal

Current Opportunities and Challenges of Next Generation Sequencing (NGS) of DNA; Determining Health and Diseases

Abstract

Many publications have demonstrated the huge potential of NGS methods for new species discovery, environmental monitoring, ecological studies, and related applications [24,35,92,97,103]. NGS will undoubtedly become one of the major tools for species identification and routine diagnostic use. Read lengths are still quite short for most existing systems, ranging between 50 bp and 800 bp, but they are likely to improve soon. Longer reads will enable easier, faster, and more reliable contig assembly and subsequent matching against reference databases. Now that data generation is no longer a bottleneck, the storage, speed of analysis, and interpretation of DNA sequence data are becoming the major challenges. The integration and use of data originating from diverse datasets and a variety of data providers are also serious issues that need to be addressed. Poor sequence record annotations and species name assignments are known problems; addressing them promptly would allow the creation of reference databases for routine NGS-based diagnostics. Samples containing huge numbers of short DNA fragments need to be analyzed and compared against reference databases efficiently and quickly. Although industry has proposed a number of solutions in the form of commercial software, hurdles remain. One challenge is data upload from clients' computers to central or distributed data storage and analysis services. Another is the efficient parallelization of analyses using cloud or grid solutions. The reliability and uptime of storage and analysis facilities is another important problem that needs to be addressed before they can be used for routine diagnostics. Finally, the management, reporting, and visualization of analysis results are among the last issues, but not the least challenging ones.
Given the constant growth of the computational power and storage capacity needed by bioinformatics applications, working with a single server or a limited number of servers is no longer realistic. Using a cloud environment and grid computing is becoming a necessity. Even a single cloud service provider can be restrictive for bioinformatics applications, and working with more than one cloud can make the workflow more robust in the face of failures and ever-growing capacity needs. In this white paper we review the current state of the art in this field. We discuss the main limitations and challenges that need to be addressed, including data upload from clients' computers to central or distributed data storage and analysis services; efficient parallelization of analyses using grid solutions; reliability and uptime of storage and analysis facilities for routine diagnostics; and management, retrieval, and visualization of analysis results.
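
As a rough illustration of what parallelizing the comparison of short reads against a reference database can look like, the sketch below splits reads into batches and classifies each batch concurrently. It is not the method of the paper: the k-mer voting scheme, the function names (`build_index`, `classify_read`, `classify_parallel`), and the parameters `k`, `batch_size`, and `workers` are all assumptions made for the example. A local thread pool stands in for the cloud or grid nodes a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the names of the reference sequences containing it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify_read(read, index, k):
    """Return the reference sharing the most k-mers with the read, or None."""
    votes = {}
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] = votes.get(name, 0) + 1
    if not votes:
        return None
    # Iterating names in sorted order makes ties resolve alphabetically.
    return max(sorted(votes), key=votes.get)


def classify_batch(batch, index, k):
    """Classify every read in one batch (the unit of work sent to a worker)."""
    return [classify_read(read, index, k) for read in batch]


def classify_parallel(reads, index, k, batch_size=2, workers=4):
    """Split reads into batches, classify batches concurrently, keep input order."""
    batches = [reads[i:i + batch_size] for i in range(0, len(reads), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the submitted batches.
        results = list(pool.map(lambda b: classify_batch(b, index, k), batches))
    return [label for batch in results for label in batch]
```

The batches are independent of one another, so the same split-and-merge structure would carry over if each batch were sent to a separate process or remote node instead of a thread; only the shared index would then need to be distributed to each worker.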
