Conference: Annual International Conference on Research in Computational Molecular Biology

Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data

Abstract

Knowledge about protein function is often encoded in the form of large, sparse undirected graphs in which vertices are proteins and edges represent their functional relationships. One elementary task in the computational use of these networks is to quantify the density of edges, referred to as connectedness, inside a prescribed protein set. For instance, many functional modules can be identified because of their high connectedness. Since individual proteins can have very different numbers of interactions, a connectedness measure should be well normalized for vertex degree. Namely, its distribution across random sets of vertices should not be affected when these sets are biased toward hubs. We show that such degree-robustness can be achieved via an analytical framework based on a random graph model with given expected degrees. We also introduce the concept of a connectedness profile, which characterizes the relation between adjacency in a graph and a prescribed order of its vertices. A straightforward application to gene expression data and protein networks is the identification of tissue-specific functional modules or of cellular processes perturbed in an experiment. The strength of the mapping between gene-expression score and interaction in the network is measured by the area of the connectedness profile. Deriving the distribution of this area under the random graph model enables us to define degree-robust statistics that can be computed in O(M), M being the network size. These statistics can identify groups of microarray experiments that are pathway-coherent and, more generally, vertex attributes that relate to adjacency in a graph.
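The abstract does not spell out the statistic, so the following is only a minimal sketch of how a connectedness profile might be computed in a single pass over the edges. It assumes that the null model assigns each vertex pair an edge probability of d_i * d_j / (2M), the usual form for a random graph with given expected degrees, and that the "area" is the summed difference between observed and expected edge counts over all top-k prefixes of the vertex order. The function names `connectedness_profile` and `profile_area` are hypothetical, and the paper's actual definitions and normalization may differ.

```python
from collections import defaultdict


def connectedness_profile(edges, order):
    """Return (observed, expected) edge counts inside the top-k vertices, k = 0..N.

    edges: iterable of (u, v) pairs of an undirected simple graph.
    order: list of all vertices, e.g. sorted by decreasing expression score.
    Assumption: the null edge probability for a pair is d_u * d_v / (2m).
    """
    edges = list(edges)
    rank = {v: i for i, v in enumerate(order)}
    n, m = len(order), len(edges)

    degree = defaultdict(int)
    # closes_at[k]: number of edges whose lower-ranked endpoint is the k-th vertex,
    # i.e. edges that first fall inside the prefix when it reaches size k.
    closes_at = [0] * (n + 1)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        closes_at[max(rank[u], rank[v]) + 1] += 1

    observed, expected = [0], [0.0]
    deg_sum = deg_sq_sum = 0
    for k in range(1, n + 1):
        d = degree[order[k - 1]]
        deg_sum += d
        deg_sq_sum += d * d
        observed.append(observed[-1] + closes_at[k])
        # Sum over pairs i < j in the prefix of d_i * d_j / (2m),
        # written as ((sum d)^2 - sum d^2) / (4m).
        expected.append((deg_sum ** 2 - deg_sq_sum) / (4 * m) if m else 0.0)
    return observed, expected


def profile_area(observed, expected):
    """Assumed area: summed excess of observed over expected edge counts."""
    return sum(o - e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Toy path graph a-b-c-d, ordered along the path.
    toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
    obs, exp = connectedness_profile(toy_edges, ["a", "b", "c", "d"])
    print(obs, exp, profile_area(obs, exp))
```

Each edge is visited once and each vertex once, so the sketch runs in time linear in the number of edges plus vertices. It yields only the raw profile and its area; turning the area into a degree-robust statistic would additionally require its distribution under the random graph model, which the abstract mentions but does not give.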
