Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data

Abstract

Summary: Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools for studying spatial genome organization. Removing biases from the chromatin contact matrices generated by such techniques is a critical preprocessing step for subsequent analyses. The continuing decline of sequencing costs has led to ever-improving resolution of Hi-C data, resulting in very large chromatin contact matrices. Such large matrices, however, pose a great challenge to the memory usage and speed of their normalization. There is therefore an urgent need for fast and memory-efficient methods for normalizing Hi-C data. We developed Hi-Corrector, an easy-to-use, open source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability: the software is capable of normalizing Hi-C data of any size in reasonable time; (ii) memory efficiency: the sequential version can run on any single computer, however limited its memory; (iii) speed: the parallel version can run very fast on multiple computing nodes with limited local memory.

Availability and implementation: The sequential version is implemented in ANSI C and can be easily compiled on any system; the parallel version is implemented in ANSI C with the MPI library (a standardized and portable parallel environment designed for solving large-scale scientific problems). The package is freely available.

Supplementary information: Supplementary information is available at Bioinformatics online.
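To make the idea of bias removal concrete, the following is a minimal sketch of one common approach to contact-matrix normalization, matrix balancing by iterative correction. The abstract does not name the algorithm or describe its data layout, so the choice of this technique, the function name `iterative_correction`, and its parameters are assumptions made for illustration only; this is not the package's ANSI C code. The sketch visits one row at a time when accumulating row sums, which hints at how a sequential implementation might keep memory use low by streaming rows rather than holding the whole matrix.

```python
def iterative_correction(matrix, max_iter=200, tol=1e-6):
    """Balance a symmetric contact matrix (list of lists, hypothetical helper).

    Returns (bias, normalized), where
    normalized[i][j] = matrix[i][j] / (bias[i] * bias[j]).
    max_iter bounds the loop because convergence is not guaranteed for
    every input (e.g. a purely diagonal matrix can oscillate).
    """
    n = len(matrix)
    bias = [1.0] * n
    for _ in range(max_iter):
        # Row sums of the currently corrected matrix, one row at a time.
        sums = []
        for i in range(n):
            row = matrix[i]
            sums.append(sum(row[j] / (bias[i] * bias[j]) for j in range(n)))
        nonzero = [s for s in sums if s > 0]
        mean = sum(nonzero) / len(nonzero)
        # Relative coverage of each row; empty rows are left unchanged.
        scale = [s / mean if s > 0 else 1.0 for s in sums]
        bias = [b * f for b, f in zip(bias, scale)]
        # Stop once every row sum is within tol of the mean.
        if max(abs(f - 1.0) for f in scale) < tol:
            break
    normalized = [
        [matrix[i][j] / (bias[i] * bias[j]) for j in range(n)]
        for i in range(n)
    ]
    return bias, normalized


if __name__ == "__main__":
    # Toy input: uniform "true" contacts distorted by per-locus biases 1, 2, 4.
    true_bias = [1.0, 2.0, 4.0]
    observed = [[10.0 * a * b for b in true_bias] for a in true_bias]
    bias, normalized = iterative_correction(observed)
    print("estimated bias:", bias)
    print("row sums after correction:", [sum(r) for r in normalized])
```

On the toy input, the estimated biases recover the ratios of the simulated ones and the corrected rows have equal sums. A parallel variant could plausibly split the rows across processes and combine the partial row sums, but that design is likewise an assumption rather than something stated in the abstract.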