Published in: Bioinformatics

The Sleipnir library for computational functional genomics



Abstract

Motivation: Biological data generation has accelerated to the point where hundreds or thousands of whole-genome datasets of various types are available for many model organisms. This wealth of data can lead to valuable biological insights when analyzed in an integrated manner, but the computational challenge of managing such large data collections is substantial. In order to mine these data efficiently, it is necessary to develop methods that use storage, memory and processing resources carefully.

Results: The Sleipnir C++ library implements a variety of machine learning and data manipulation algorithms with a focus on heterogeneous data integration and efficiency for very large biological data collections. Sleipnir allows microarray processing, functional ontology mining, clustering, Bayesian learning and inference and support vector machine tasks to be performed for heterogeneous data on scales not previously practical. In addition to the library, which can easily be integrated into new computational systems, prebuilt tools are provided to perform a variety of common tasks. Many tools are multithreaded for parallelization in desktop or high-throughput computing environments, and most tasks can be performed in minutes for hundreds of datasets using a standard personal computer.

Availability: Source code (C++) and documentation are available at and compiled binaries are available from the authors on request.

Contact:
