Journal: Bioinformatics

A faster circular binary segmentation algorithm for the analysis of array CGH data

Abstract

Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number. The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding P-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm.

Results: We present a hybrid approach to obtain the P-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analyses of array CGH data from breast cancer cell lines to show the impact of the new approaches on the analysis of real data.
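To make the O(N^2) cost concrete, the following is a minimal Python sketch of the baseline that the abstract calls computationally prohibitive: a maximal t-type statistic over all circular arcs, with a full permutation reference distribution. It is not the hybrid method or the stopping rule from the paper. The function names, the standardization by the overall sample standard deviation, and the (exceed + 1) / (n_perm + 1) P-value convention are assumptions made for illustration.

```python
import numpy as np


def max_circular_t(x):
    """Largest |t|-type statistic over all arcs x[i:j] versus their complement.

    Illustrative only: the scaling by the overall sample standard deviation
    is an assumption, not necessarily the paper's exact standardization.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two markers")
    sd = x.std(ddof=1)
    if sd == 0.0:
        return 0.0  # constant data: no evidence of a change
    s = np.concatenate(([0.0], np.cumsum(x)))  # s[k] = x[0] + ... + x[k-1]
    total = s[n]
    best = 0.0
    for i in range(n):  # outer loop over arc start: N iterations
        j = np.arange(i + 1, n + 1)  # inner (vectorized) loop over arc end
        k = j - i  # arc length
        keep = k < n  # the complement must be non-empty
        j, k = j[keep], k[keep]
        if k.size == 0:
            continue
        arc_sum = s[j] - s[i]
        diff = arc_sum / k - (total - arc_sum) / (n - k)
        z = np.abs(diff) / (sd * np.sqrt(1.0 / k + 1.0 / (n - k)))
        best = max(best, float(z.max()))
    return best


def permutation_p_value(x, n_perm=99, seed=0):
    """Full permutation P-value: each permutation costs another O(N^2) pass."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = max_circular_t(x)
    exceed = sum(
        max_circular_t(rng.permutation(x)) >= observed for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)
```

Each call to `max_circular_t` examines on the order of N^2 arcs, and the permutation test repeats that once per permutation. This is the cost that the linear-time hybrid approach and the early stopping rule are meant to reduce.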
