Genome-wide association studies (GWAS) apply association analysis at the whole-genome level to identify disease-causing genes. Traditional approaches do not account for interactions between genes, and their efficiency and accuracy tend to be low when many complex factors are involved. To address these problems, this paper proposes a mutual-information-based method for selecting a set of structural key SNPs. Building on mutual information theory and simulated data, the method reverse-constructs SNP mutual information networks, sweeps the mutual information threshold over a given range, and compares the network statistics of the case and control groups at each threshold in order to choose a suitable value. Using the chosen threshold, it then screens for "structural key SNPs", those with a marked effect on network structure. Experimental results show that the proposed parameter selection method can identify these key SNPs quickly and accurately.
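The pipeline described in the abstract (pairwise mutual information, thresholding into a network, comparing case and control statistics across thresholds) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation. The function names, the use of edge count as the compared statistic, and the degree-difference ranking used to pick "structural key SNPs" are all assumptions made here for illustration; the abstract does not specify them. Genotypes are assumed to be discrete codes (for example 0/1/2), stored as one list per SNP.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def mi_network(genotypes, threshold):
    """Return the set of SNP index pairs (i, j) whose MI is >= threshold."""
    edges = set()
    for i, j in combinations(range(len(genotypes)), 2):
        if mutual_information(genotypes[i], genotypes[j]) >= threshold:
            edges.add((i, j))
    return edges


def degree_counts(edges, n_snps):
    """Degree of each SNP node in the thresholded network."""
    deg = [0] * n_snps
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


def scan_thresholds(case, control, thresholds):
    """For each threshold, report (threshold, case edge count, control edge count)."""
    return [(t, len(mi_network(case, t)), len(mi_network(control, t)))
            for t in thresholds]


def structural_key_snps(case, control, threshold, top_k):
    """Hypothetical selection rule: rank SNPs by the absolute difference
    between their degree in the case network and in the control network."""
    n = len(case)
    d_case = degree_counts(mi_network(case, threshold), n)
    d_ctrl = degree_counts(mi_network(control, threshold), n)
    diff = [abs(a - b) for a, b in zip(d_case, d_ctrl)]
    return sorted(range(n), key=lambda i: diff[i], reverse=True)[:top_k]
```

In use, one would call `scan_thresholds` over a range of candidate thresholds, pick a value at which the case and control statistics separate clearly, and then pass that value to `structural_key_snps`. The exhaustive pairwise loop is quadratic in the number of SNPs, so this sketch is only practical for small simulated panels.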