Journal: Journal of Computational Biology: A Journal of Computational Molecular Cell Biology

Statistical Comparison Framework and Visualization Scheme for Ranking-Based Algorithms in High-Throughput Genome-Wide Studies



Abstract

As a first step in analyzing high-throughput data in genome-wide studies, several algorithms are available to identify and prioritize candidate lists for downstream fine-mapping. The prioritized candidates could be differentially expressed genes, aberrations in comparative genomic hybridization studies, or single nucleotide polymorphisms (SNPs) in association studies. Different analysis algorithms are subject to various experimental artifacts and analytical features that lead to different candidate lists. However, little research has been carried out to theoretically quantify the consensus between different candidate lists and to compare the study-specific accuracy of the analytical methods based on a known reference candidate list. Within the context of genome-wide studies, we propose a generic mathematical framework to statistically compare ranked lists of candidates from different algorithms with each other or, if available, with a reference candidate list. To cope with the growing need for intuitive visualization of high-throughput data in genome-wide studies, we describe a complementary customizable visualization tool. As a case study, we demonstrate application of our framework to the comparison and visualization of candidate lists generated in a DNA-pooling-based genome-wide association study of CEPH data in the HapMap project, where prior knowledge from individual genotyping can be used to generate a true reference candidate list. The results provide a theoretical basis to compare the accuracy of various methods and to identify redundant methods, thus providing guidance for selecting the most suitable analysis method in genome-wide studies.
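As a rough illustration of what comparing two ranked candidate lists can involve, the sketch below counts the candidates shared by the top-k entries of two lists and computes a hypergeometric tail probability for that overlap under random selection. This is not the framework proposed in the paper, whose details the abstract does not give; the function names, the SNP identifiers, and the choice of a top-k overlap statistic are all assumptions made for illustration.

```python
from math import comb


def top_k_overlap(list_a, list_b, k):
    """Count candidates that appear in the top-k of both ranked lists."""
    return len(set(list_a[:k]) & set(list_b[:k]))


def overlap_p_value(overlap, k, n_total):
    """Hypergeometric tail P(X >= overlap).

    Assumes (hypothetically) that each top-k set is a uniformly random
    draw of k candidates out of n_total, independent of the other list.
    """
    if k > n_total:
        raise ValueError("k cannot exceed the number of candidates")
    denom = comb(n_total, k)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible overlap values contribute nothing to the sum.
    tail = sum(comb(k, x) * comb(n_total - k, k - x)
               for x in range(overlap, k + 1))
    return tail / denom


# Hypothetical ranked SNP lists from two analysis algorithms.
ranked_a = ["rs1", "rs2", "rs3", "rs4", "rs5", "rs6"]
ranked_b = ["rs2", "rs1", "rs6", "rs3", "rs5", "rs4"]

shared = top_k_overlap(ranked_a, ranked_b, k=3)
p_value = overlap_p_value(shared, k=3, n_total=len(ranked_a))
print(shared, p_value)
```

The same two functions could be applied with a reference candidate list in place of `ranked_b`, which corresponds to the accuracy comparison the abstract mentions. A top-k overlap ignores the order within the top k, so a rank-sensitive statistic may be preferable when ordering matters.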
