BMC Genomics

CGHScan: finding variable regions using high-density microarray comparative genomic hybridization data


Abstract

Background

Comparative genomic hybridization can rapidly identify chromosomal regions that vary between organisms and tissues. This technique has been applied to detecting differences between normal and cancerous tissues in eukaryotes as well as genomic variability in microbial strains and species. The density of oligonucleotide probes available on current microarray platforms is particularly well suited to comparisons of organisms with smaller genomes, such as bacteria and yeast, where an entire genome can be assayed on a single microarray with high resolution. Available methods for analyzing these experiments typically confine analyses to data from pre-defined annotated genome features, such as entire genes. Many of these methods are ill-suited for datasets with the number of measurements typical of high-density microarrays.

Results

We present an algorithm for analyzing microarray hybridization data to aid identification of regions that vary between an unsequenced genome and a sequenced reference genome. The program, CGHScan, uses an iterative random walk approach integrating multi-layered significance testing to detect these regions from comparative genomic hybridization data. The algorithm tolerates a high level of noise in measurements of individual probe intensities and is relatively insensitive to the choice of method for normalizing probe intensity values and identifying probes that differ between samples. When applied to comparative genomic hybridization data from a published experiment, CGHScan identified eight of nine known deletions in a Brucella ovis strain as compared to Brucella melitensis. The same result was obtained using two different normalization methods and two different scores to classify data for individual probes as representing conserved or variable genomic regions. The undetected region is a small (58 base pair) deletion that is below the resolution of CGHScan given the array design employed in the study.

Conclusion

CGHScan is an effective tool for analyzing comparative genomic hybridization data from high-density microarrays. The algorithm is capable of accurately identifying known variable regions and is tolerant of high noise and varying methods of data preprocessing. Statistical analysis is used to define each variable region, providing a robust and reliable method for rapid identification of genomic differences independent of annotated gene boundaries.
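
The abstract does not give the details of the iterative random walk or its significance tests, so the following is only a minimal sketch of the general idea, not CGHScan's actual implementation. It assumes each probe has already been called variable or conserved, scores the calls as steps of a walk along the genome, picks the highest-scoring contiguous run, and checks it against a permutation null before masking it and scanning again. The function names, step sizes, permutation count and threshold (`best_segment`, `scan`, `step_up`, `step_down`, `n_perm`, `alpha`) are all hypothetical choices made for illustration.

```python
import random


def best_segment(scores):
    """Return (start, end, total) of the maximal-sum contiguous run.

    Uses Kadane's algorithm; `end` is exclusive. Returns (0, 0, 0.0)
    when no run has a positive sum.
    """
    best = (0, 0, 0.0)
    run_start, run_sum = 0, 0.0
    for i, s in enumerate(scores):
        if run_sum <= 0:
            # The walk has dropped to zero or below: restart here.
            run_start, run_sum = i, 0.0
        run_sum += s
        if run_sum > best[2]:
            best = (run_start, i + 1, run_sum)
    return best


def scan(calls, step_up=1.0, step_down=-2.0, n_perm=200, alpha=0.05, seed=0):
    """Iteratively report candidate variable regions.

    calls: booleans in genome order, True = probe called variable.
    All numeric defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    scores = [step_up if c else step_down for c in calls]
    regions = []
    while True:
        start, end, total = best_segment(scores)
        if total <= 0:
            break
        # Permutation null: how often does shuffled data score as high?
        exceed = 0
        for _ in range(n_perm):
            shuffled = scores[:]
            rng.shuffle(shuffled)
            if best_segment(shuffled)[2] >= total:
                exceed += 1
        p_value = (exceed + 1) / (n_perm + 1)
        if p_value > alpha:
            break
        regions.append((start, end, p_value))
        # Mask the accepted region so the next pass finds a different one.
        for i in range(start, end):
            scores[i] = step_down
    return sorted(regions)


# Toy data: a 12-probe variable run with one noisy conserved call inside it,
# plus one isolated false-positive probe elsewhere.
calls = [False] * 40 + [True] * 12 + [False] * 40
calls[45] = False
calls[5] = True
regions = scan(calls)
```

On this toy input the scan reports the single run spanning probes 40 to 51 and rejects the isolated probe at index 5, which illustrates how a region-level test can tolerate noise in individual probe calls.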
