Journal: Bioinformatics

Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data


Abstract

Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools for studying spatial genome organization. Removing biases from the chromatin contact matrices generated by such techniques is a critical preprocessing step for subsequent analyses. The continuing decline of sequencing costs has led to an ever-improving resolution of Hi-C data, resulting in very large matrices of chromatin contacts. Such large matrices, however, pose a great challenge to the memory usage and speed of their normalization. Therefore, there is an urgent need for fast and memory-efficient methods for normalizing Hi-C data. We developed Hi-Corrector, an easy-to-use, open-source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability: the software is capable of normalizing Hi-C data of any size in a reasonable time; (ii) memory efficiency: the sequential version can run on any single computer with very limited memory, no matter how little; (iii) speed: the parallel version can run very fast on multiple computing nodes with limited local memory.
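
The abstract does not spell out the normalization algorithm, so the following is only a minimal sketch under stated assumptions, not Hi-Corrector's actual code. It assumes an iterative-correction style of matrix balancing, in which a bias value per locus is estimated so that every row of the corrected contact matrix has roughly the same sum. To illustrate the memory-efficiency idea, the matrix is consumed one row at a time, so only the bias vector and the row sums stay in memory. The names `estimate_bias`, `corrected_rows`, and `read_rows` are hypothetical.

```python
def estimate_bias(read_rows, n, max_iter=200, tol=1e-6):
    """Estimate per-locus biases b so that M[i][j] / (b[i] * b[j])
    has approximately equal row sums.

    read_rows: zero-argument callable returning a fresh iterator over the
               rows of a symmetric n x n contact matrix (a hypothetical
               stand-in for re-reading the matrix from disk on each pass).
    """
    bias = [1.0] * n
    for _ in range(max_iter):
        # One streaming pass: row sums of the currently corrected matrix.
        sums = []
        for i, row in enumerate(read_rows()):
            sums.append(sum(v / (bias[i] * bias[j])
                            for j, v in enumerate(row) if v))
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break  # empty matrix: nothing to balance
        mean = sum(nonzero) / len(nonzero)
        # Rows with no contacts keep their bias unchanged (factor 1.0).
        delta = [s / mean if s > 0 else 1.0 for s in sums]
        bias = [b * d for b, d in zip(bias, delta)]
        if max(abs(d - 1.0) for d in delta) < tol:
            break  # row sums are already (nearly) equal
    return bias


def corrected_rows(read_rows, bias):
    """Yield the corrected matrix row by row, without holding it in memory."""
    for i, row in enumerate(read_rows()):
        yield [v / (bias[i] * bias[j]) for j, v in enumerate(row)]


if __name__ == "__main__":
    # Toy symmetric matrix; a real Hi-C matrix would be streamed from a file.
    toy = [[2.0, 2.0, 4.0],
           [2.0, 8.0, 8.0],
           [4.0, 8.0, 32.0]]
    b = estimate_bias(lambda: iter(toy), len(toy))
    for row in corrected_rows(lambda: iter(toy), b):
        print([round(v, 3) for v in row], "row sum:", round(sum(row), 3))
```

Each iteration needs one full pass over the matrix but only O(n) working memory. The same row-wise structure is what would let a parallel variant assign blocks of rows to different computing nodes, although how Hi-Corrector actually partitions the work is not described in the abstract.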
