
Genome-Wide Locations of Potential Epimutations Associated with Environmentally Induced Epigenetic Transgenerational Inheritance of Disease Using a Sequential Machine Learning Prediction Approach



Abstract

Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline-transmitted epimutations. The primary epimutations identified involve differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure-specific (i.e., toxicant-specific) signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG per 100 bp), termed CpG deserts, and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50,000 individual sites with a mean size of 1 kb, and 3,233 regions containing a minimum of three adjacent sites with a mean size of 3.5 kb. A small set of the most relevant genomic features was identified, with low-density CpG deserts being a critical feature among those selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified fewer regions (1,503) of genome-wide predicted sites, as well as differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the F3 generation of dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) pesticide exposure lineages. Analysis of this positive validation data set showed a 100% prediction accuracy for all the DDT-MXC sperm epimutations. These observations further elucidate the genomic features associated with transgenerational germline epimutations and identify a genome-wide set of potential epimutations that can be used to facilitate identification of epigenetic diagnostics for ancestral environmental exposures and disease susceptibility.
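To make two of the quantities in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It shows how a window could be tested against the CpG desert threshold (<3 CpG per 100 bp) and how individual predicted sites could be merged into regions of at least three adjacent sites. The function names, the half-open `(start, end)` interval convention, and the `max_gap` definition of "adjacent" are all assumptions introduced for illustration; the abstract does not specify them.

```python
def cpg_density(seq: str) -> float:
    """Return the number of CpG dinucleotides per 100 bp of sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact CpG count.
    return 100.0 * seq.count("CG") / len(seq)


def is_cpg_desert(seq: str, threshold: float = 3.0) -> bool:
    """Flag a window as a CpG desert using the <3 CpG / 100 bp threshold."""
    return cpg_density(seq) < threshold


def group_adjacent_sites(sites, max_gap=0, min_sites=3):
    """Merge sites on one chromosome into regions.

    Assumptions (hypothetical, not from the source): sites are half-open
    (start, end) intervals, and two sites are "adjacent" when the gap
    between them is at most max_gap base pairs. Only clusters with at
    least min_sites members are reported.
    """
    regions = []
    count, region_start, region_end = 0, None, None
    for start, end in sorted(sites):
        if count and start - region_end <= max_gap:
            # Site touches or overlaps the current cluster: extend it.
            region_end = max(region_end, end)
            count += 1
        else:
            # Gap too large: close the current cluster and start a new one.
            if count >= min_sites:
                regions.append((region_start, region_end))
            region_start, region_end, count = start, end, 1
    if count >= min_sites:
        regions.append((region_start, region_end))
    return regions


if __name__ == "__main__":
    sparse_window = "CGCG" + "A" * 96   # 2 CpG in 100 bp
    dense_window = "CG" * 50            # 50 CpG in 100 bp
    print(is_cpg_desert(sparse_window), is_cpg_desert(dense_window))

    sites = [(0, 1000), (1000, 2000), (2000, 3000), (10000, 11000)]
    print(group_adjacent_sites(sites))
```

In this toy input, the three contiguous 1 kb sites merge into a single region, while the isolated site is dropped because it has fewer than three adjacent members.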
