
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data

Abstract

DNA methylation is an important epigenetic event that affects gene expression during development and in various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, tens of probes are associated with each gene. Given the methylation intensities of all these probes, it is necessary to determine which probes are most representative of the gene-centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that uses different classification methods to compute gene-centric DNA methylation from probe-level DNA methylation data. We compared our algorithm to other feature selection algorithms, such as support vector machines with recursive feature elimination, genetic algorithms, and ReliefF. We evaluated all methods by how well the selected probes predicted the mRNA expression levels of their genes, and found that K-Nearest Neighbors classification combined with sequential forward selection performed better than the other algorithms on all metrics. We also observed that the transcriptional activities of certain genes were more sensitive to DNA methylation changes than those of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
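The abstract names the general technique (sequential forward selection wrapped around a K-Nearest Neighbors classifier) but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes that expression has been discretized into class labels, that candidate probe sets are scored by leave-one-out accuracy, and that selection stops when no remaining probe improves the score. The function names and the toy beta values are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of KNN restricted to the given probe columns."""
    sub = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, sub[i], k) == y[i]
    return hits / len(sub)

def sequential_forward_selection(X, y, k=3):
    """Greedily add the probe that most improves accuracy; stop when none does."""
    n_features = len(X[0])
    selected, best_score = [], 0.0
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        # Score each candidate probe when added to the current selection.
        scored = [(loo_accuracy(X, y, selected + [j], k), j) for j in candidates]
        score, best_j = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:  # assumed stopping rule: no improvement
            break
        selected.append(best_j)
        best_score = score
    return selected, best_score

# Hypothetical toy data: 8 samples x 3 probes (methylation beta values).
# Only probe 1 tracks the (discretized) expression label.
X = [
    [0.50, 0.10, 0.30], [0.52, 0.15, 0.70], [0.48, 0.12, 0.40], [0.51, 0.20, 0.60],
    [0.49, 0.80, 0.35], [0.50, 0.85, 0.65], [0.53, 0.90, 0.45], [0.47, 0.88, 0.55],
]
y = ["high"] * 4 + ["low"] * 4

selected, score = sequential_forward_selection(X, y)
print(selected, score)  # [1] 1.0
```

On this toy input the procedure keeps only the informative probe, because adding either noisy probe cannot raise an already perfect leave-one-out score. A real analysis would likely use a held-out test set or nested cross-validation rather than scoring on the same samples used for selection.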

Bibliographic details

  • Journal: other
  • Authors: Brittany Baur; Serdar Bozdag
  • Volume (issue): 11(2)
  • Pages: e0148977
  • Total pages: 19
  • Original format: PDF
