Journal: PLoS Computational Biology

Comprehensive analysis of cancer breakpoints reveals signatures of genetic and epigenetic contribution to cancer genome rearrangements


Abstract

Understanding the mechanisms of cancer breakpoint mutagenesis is a difficult task, and predictive models of cancer breakpoint formation have so far failed to achieve even moderate predictive power. Here we take advantage of a machine learning approach that can extract important features from big data and quantify the contribution of different factors. We performed a comprehensive analysis of almost 630,000 cancer breakpoints and quantified the contribution of genomic and epigenomic features: non-B DNA structures, chromatin organization, transcription factor binding sites and epigenetic markers. The results showed that transcription and the formation of non-B DNA structures are the two major processes responsible for cancer genome fragility. Epigenetic factors, such as chromatin organization in TADs, open/closed regions, DNA methylation and histone marks, are less informative but still contribute. As a general trend, among the individual features within the groups, G-quadruplexes and repeats and the transcription factors CTCF, GABPA, RXRA, SP1, MAX and NR2F2 show a relatively high contribution. Overall, the cancer breakpoint landscape can be represented by well-predicted hotspots and poorly predicted individual breakpoints scattered across genomes. We demonstrated that hotspot mutagenesis has genomic and epigenomic factors, and that not all individual cancer breakpoints are random noise; some have a definite mutation signature. In addition, we found a long-range action of some features on breakpoint mutagenesis. By combining omics data and cancer-specific individual feature importance, and by adding distant features to local ones, predictive models of cancer breakpoint formation achieved 70–90% ROC AUC for different cancer types; however, precision remained low at 2% and recall did not exceed 50%.
On the one hand, the power of the models strongly correlates with the size of the available cancer breakpoint and epigenomic data; on the other hand, finding strong determinants of cancer breakpoint formation remains a challenge. The strength of the predictive signals of each group and of each feature within a group can be converted into cancer-specific breakpoint mutation signatures. Overall, our results add to the understanding of cancer genome rearrangement processes.
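
A high ROC AUC alongside a precision of about 2% is what one would expect when positives (breakpoint-containing windows) are rare. The sketch below works through the arithmetic using Bayes' rule. The function name `precision_from_rates` and the example values for false-positive rate and prevalence are hypothetical illustrations chosen for this note; they are not figures reported in the paper.

```python
def precision_from_rates(recall: float, false_positive_rate: float, prevalence: float) -> float:
    """Precision implied by recall, false-positive rate and positive-class prevalence.

    precision = TP / (TP + FP), with both terms expressed as fractions of all samples.
    """
    true_pos = recall * prevalence                        # fraction of samples that are true positives
    false_pos = false_positive_rate * (1.0 - prevalence)  # fraction that are false positives
    if true_pos + false_pos == 0.0:
        raise ValueError("no predicted positives; precision is undefined")
    return true_pos / (true_pos + false_pos)


# Hypothetical values for illustration only (NOT taken from the paper):
# half of the true breakpoint windows are recovered, 10% of the negative
# windows are flagged, and 0.4% of all windows contain a breakpoint.
example_precision = precision_from_rates(
    recall=0.5, false_positive_rate=0.1, prevalence=0.004
)
print(f"precision = {example_precision:.3f}")
```

Under these assumed rates the classifier separates the classes reasonably well, yet false positives from the much larger negative class outnumber true positives, so precision stays near 2%.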
