Journal: 《生物信息学》

Analysis of Transcriptional Regulatory Motifs of Yeast Genes Based on a Log-Linear Model

Abstract

Identifying the transcription factor binding sites (or motifs) of eukaryotic genes is a major task of the post-genomic era. Analyzing co-expressed or co-regulated genes together can improve the accuracy of motif identification. In this paper we analyze the transcriptional regulatory motifs commonly used in the ribosomal protein (RP) genes of yeast, using a log-linear model of the 2×2 contingency table and counting the number of genes in which each motif occurs. A U-test is then used to further select the motifs that are over-represented relative to background sequences. These motifs are potential transcriptional regulatory elements of yeast RP genes, and 90% of them agree with experimentally verified transcription factor binding sites. The advantage of this method is that it searches for the motifs shared by a set of gene promoters under a strict statistical criterion, which overcomes the fuzzy judgment of how widely a motif is used that affected earlier analyses. The method can also efficiently search for combinatorial regulatory motif pairs in families of co-expressed genes. We also observed that the Pearson correlation coefficient, which reflects the degree of association between the two attributes of a 2×2 contingency table, is clearly correlated with the interaction effect of the log-linear model. This result suggests that the interaction effect of the log-linear model can be used to evaluate the association between two attributes.
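
The relationship between the interaction effect and the Pearson correlation coefficient can be illustrated with a minimal sketch. The abstract does not state how the model is parameterized, so the code below assumes the saturated log-linear model with sum-to-zero (effect) coding, in which the interaction term of a 2×2 table equals one quarter of the log odds ratio. The Pearson correlation of two binary attributes is computed as the phi coefficient. The function names and all counts are hypothetical and are not taken from the paper.

```python
import math

def interaction_effect(n11, n12, n21, n22):
    """Interaction term of the saturated log-linear model for a 2x2 table,
    assuming sum-to-zero coding: (1/4) * ln(n11*n22 / (n12*n21)).
    All four cell counts must be positive."""
    return 0.25 * math.log((n11 * n22) / (n12 * n21))

def pearson_phi(n11, n12, n21, n22):
    """Pearson correlation (phi coefficient) of the two binary attributes."""
    r1, r2 = n11 + n12, n21 + n22  # row totals
    c1, c2 = n11 + n21, n12 + n22  # column totals
    return (n11 * n22 - n12 * n21) / math.sqrt(r1 * r2 * c1 * c2)

# Hypothetical 2x2 tables (n11, n12, n21, n22), e.g. counts of genes whose
# promoters contain both, only one, or neither of two motifs.
tables = {
    "positive": (40, 10, 10, 40),
    "independent": (25, 25, 25, 25),
    "negative": (10, 40, 40, 10),
}

results = {
    name: (interaction_effect(*cells), pearson_phi(*cells))
    for name, cells in tables.items()
}

for name, (lam, phi) in results.items():
    print(f"{name:12s} interaction={lam:+.3f}  phi={phi:+.3f}")
```

In these illustrative tables the two quantities share the same sign and are both zero when the attributes are independent, which is consistent with the correlation reported in the abstract. A table with a zero cell would need separate handling, since the log odds ratio is undefined there.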
