Annual International Conference of the IEEE Engineering in Medicine and Biology Society

A comparative study for characterisation and prediction of tissue-specific DNA methylation of CpG islands in chromosomes 6, 20 and 22



Abstract

Advanced technology has enabled the identification of tissue-specific methylated CpG islands in different human tissues. Because methylation of CpG islands is involved in various biological phenomena, and DNA methylation is linked to human diseases such as cancer, analysis of CpG islands has become important for characterising and modelling biological phenomena and for understanding the mechanisms of such diseases. However, analysis of the data associated with CpG islands is still a new and challenging subject in bioinformatics, systems biology and epigenetics. This paper addresses the prediction of methylated and unmethylated CpG islands on human chromosomes 6, 20 and 22. For this purpose, a data set of 451 CpG island samples from 12 tissues, covering chromosomes 6, 20 and 22, was obtained. For each sample, four feature sub-sets totalling 50 attributes that characterise the methylated and unmethylated groups were extracted: (1) tissue-specific CpGI methylation, (2) evolution and conservation, (3) sequence distribution and (4) DNA structure and properties. Because the data set is unbalanced, the leave-one-out (LOO) method was modified by incorporating the m-fold cross validation approach, in order to avoid the disadvantages of the traditional LOO and m-fold cross validation methods. A K-nearest neighbour classifier was then adapted for the prediction. The results of 450 different comprehensive analyses show that the methylated CpG islands can be distinguished from the unmethylated CpG islands with a predictive accuracy of between 93.33% and 100%. More importantly, the modified LOO identifies these two groups more clearly and reliably when the feature sub-sets are combined. In addition, the modified-LOO cross validation identified the tissue-specific CpGI methylation feature sub-set as one of the most significant sets, which is not the case with the traditional cross validation methods.
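The abstract does not spell out how the LOO method was modified, so the following is only a minimal sketch of the plain building blocks it names: leave-one-out validation around a k-nearest-neighbour classifier, with accuracy reported per class so that the minority group is not hidden by the majority in an unbalanced set. The data are synthetic, and the class sizes, feature count, value of k and all function names are assumptions made for illustration; none of them come from the paper.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out(samples, k=3):
    """Hold out each sample once, classify it from the rest,
    and return the accuracy for each class separately."""
    correct = Counter()
    total = Counter()
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        total[label] += 1
        if knn_predict(train, features, k) == label:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in total}


def make_synthetic(n_major=50, n_minor=10, n_features=5, seed=0):
    """Hypothetical unbalanced data: two Gaussian clouds of unequal size.
    Which class is the larger one here is an arbitrary choice."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_major):
        data.append((tuple(rng.gauss(0.0, 1.0) for _ in range(n_features)),
                     "unmethylated"))
    for _ in range(n_minor):
        data.append((tuple(rng.gauss(3.0, 1.0) for _ in range(n_features)),
                     "methylated"))
    return data


if __name__ == "__main__":
    accuracy = leave_one_out(make_synthetic(), k=3)
    for label, value in sorted(accuracy.items()):
        print(f"{label}: {value:.2%}")
```

Reporting accuracy per class rather than as a single overall figure is one simple way to make the effect of class imbalance visible; it is not a substitute for the modified validation scheme the paper describes.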
