Fast randomization of large genomic datasets while preserving alteration counts

Abstract

Motivation: Studying combinatorial patterns in cancer genomic datasets has recently emerged as a tool for identifying novel cancer driver networks. Approaches have been devised to quantify, for example, the tendency of a set of genes to be mutated in a ‘mutually exclusive’ manner. The significance of the proposed metrics is usually evaluated by computing P-values under appropriate null models. To this end, a Monte Carlo method (the switching-algorithm) is used to sample simulated datasets under a null model that preserves patient- and gene-wise mutation rates. In this method, a genomic dataset is represented as a bipartite network, to which Markov chain updates (switching-steps) are applied. These steps modify the network topology, and a minimal number of them must be executed to draw simulated datasets independently under the null model. This number has previously been deduced empirically to be a linear function of the total number of variants, which makes the process computationally expensive.

Results: We present a novel approximate lower bound for the number of switching-steps, derived analytically. Additionally, we have developed the R package BiRewire, which includes new efficient implementations of the switching-algorithm. We illustrate the performance of BiRewire by applying it to large real cancer genomics datasets. We report vast reductions in time requirements with respect to existing implementations/bounds and equivalent P-value computations. Thus, we propose BiRewire to study statistical properties in genomic datasets, and in other data that can be modeled as bipartite networks.

Availability and implementation: BiRewire is available on BioConductor.

Supplementary information: Supplementary data are available at Bioinformatics online.
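The switching-step described in the abstract can be illustrated with a minimal Python sketch. This is not the BiRewire implementation (BiRewire is an R package); it is an assumption-laden illustration in which the dataset is a 0/1 matrix (rows taken as genes, columns as patients, purely by convention here), and the hypothetical function `switching_randomize` picks two edges at random and swaps their endpoints whenever the swap creates no duplicate edge. The number of steps `n_steps` is left to the caller; the analytical lower bound from the paper is not reproduced here.

```python
import random


def switching_randomize(matrix, n_steps, seed=None):
    """Return a rewired copy of a 0/1 matrix with unchanged row and column sums.

    Hypothetical helper for illustration only; n_steps is chosen by the caller.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy, leave the input untouched
    # Edge list of the bipartite network: one (row, column) pair per 1-entry.
    edges = [(i, j) for i, row in enumerate(m) for j, v in enumerate(row) if v]
    if len(edges) < 2:
        return m
    for _ in range(n_steps):
        a, b = rng.sample(range(len(edges)), 2)
        (i, j), (k, l) = edges[a], edges[b]
        # A switch turns edges (i, j), (k, l) into (i, l), (k, j).
        # It is accepted only if both target cells are currently empty.
        if i != k and j != l and m[i][l] == 0 and m[k][j] == 0:
            m[i][j] = m[k][l] = 0
            m[i][l] = m[k][j] = 1
            edges[a], edges[b] = (i, l), (k, j)
    return m


original = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
rewired = switching_randomize(original, n_steps=100, seed=0)
```

Each accepted switch removes one entry from and adds one entry to each affected row and column, so gene-wise and patient-wise alteration counts are preserved by construction. Rejected proposals still count as steps in this sketch.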

Bibliographic details

Journal: Bioinformatics
Authors: Andrea Gobbi; Francesco Iorio; Kevin J. Dawson; David C. Wedge; David Tamborero; Ludmil B. Alexandrov; Nuria Lopez-Bigas; Mathew J. Garnett; Giuseppe Jurman; Julio Saez-Rodriguez
Year (volume), issue: 2014 (30), 17
Pages: i617–i623
Total pages: 7
Original format: PDF
Chinese Library Classification: applied microbiology; biochemical genetics; biochemical pharmacology

Similar literature

1. Fast randomization of large genomic datasets while preserving alteration counts [J]. Gobbi Andrea, Iorio Francesco, Dawson Kevin J. Bioinformatics. 2014, Issue 17
2. Privacy-preserving GWAS analysis on federated genomic datasets [J]. Scott D Constable, Yuzhe Tang, Shuang Wang. BMC Medical Informatics and Decision Making. 2015, Supplement 5
3. Comparison of TCGA and GENIE genomic datasets for the detection of clinically actionable alterations in breast cancer [J]. Pushpinder Kaur, Tania B. Porras, Alexander Ring. Scientific Reports. 2019, Issue 1
4. A Fourier-Based Data Minimization Algorithm for Fast and Secure Transfer of Big Genomic Datasets [C]. Mohammed Aledhari, Marianne Di Pierro, Fahad Saeed. 2018 IEEE International Congress on Big Data. 2018
5. Interactive fast random access, retrieval, and navigation of large datasets [D]. Fan, Zihong. 2011
6. Privacy-preserving GWAS analysis on federated genomic datasets [O]. Scott D Constable, Yuzhe Tang, Shuang Wang. 2015
7. Privacy-preserving GWAS analysis on federated genomic datasets [O]. 2015