Journal: Frontiers in Genetics

Adapting Community Detection Algorithms for Disease Module Identification in Heterogeneous Biological Networks


Abstract

Biological networks catalog the complex web of interactions between different molecules, typically proteins, within a cell. These networks are known to be highly modular, with groups of proteins associated with specific biological functions. Human diseases often arise from the dysfunction of one or more proteins within such a functional group. The ability to identify and automatically extract these modules has implications for understanding the etiology of different diseases as well as the functional roles of different protein modules in disease. The recent DREAM challenge posed the problem of identifying disease modules from six heterogeneous networks of proteins/genes. Many community detection algorithms exist, but not all of them are adaptable to the biological context, as these networks are densely connected and biologically relevant modules are quite small. The contribution of this study is three-fold. First, we present a comprehensive assessment of many classic community detection algorithms for identifying non-overlapping communities in biological networks, and propose heuristics to identify small, structurally well-defined communities (core modules). We evaluated performance on 180 GWAS datasets. Compared with traditional approaches, our proposed approach identified 50% more disease-relevant modules. Thus, we show that identifying more compact modules is important for better performance. Next, we sought to understand the peculiar characteristics of disease-enriched modules and what causes standard community detection algorithms to detect so few of them. We performed a comprehensive analysis of the interaction patterns of known disease genes to understand the structure of disease modules, and show that merely treating the set of known disease genes as a module does not yield good-quality clusters, as measured by typical metrics such as modularity and conductance. We go on to present a methodology that leverages these known disease genes and also includes their neighboring nodes in a module, forming good-quality clusters, from which we subsequently extract a “gold-standard set” of disease modules. Lastly, we demonstrate, with justification, that “overlapping” community detection algorithms should be the preferred choice for disease module identification, since several genes participate in multiple biological functions.
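The point about known disease genes forming poor clusters on their own, and better ones once neighboring nodes are included, can be illustrated with a small sketch. The abstract does not describe the actual expansion procedure, so everything here is an assumption made for illustration: the toy graph, the gene labels, the helper names, and the `min_links` rule (add a neighbor only if it touches at least two seed genes). Conductance is computed in its usual form, cut edges divided by the smaller of the two volumes; lower values indicate a better-separated module.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def conductance(adj, module):
    """Cut edges divided by min(vol(module), vol(rest)); lower is better."""
    module = set(module)
    cut = sum(1 for u in module for v in adj[u] if v not in module)
    vol_in = sum(len(adj[u]) for u in module)
    vol_out = sum(len(adj[u]) for u in adj if u not in module)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def expand_with_neighbors(adj, seeds, min_links=2):
    """Hypothetical rule: add non-seed nodes with >= min_links seed neighbors."""
    seeds = set(seeds)
    added = {n for n in adj
             if n not in seeds and len(adj[n] & seeds) >= min_links}
    return seeds | added


# Toy network (illustrative only): two tight groups joined by the edge D-E.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G"), ("F", "H"),
         ("G", "H")]
adj = build_graph(edges)

seeds = {"A", "B", "D"}            # stand-ins for known disease genes
expanded = expand_with_neighbors(adj, seeds)

print("seed-only conductance:", conductance(adj, seeds))
print("expanded module:", sorted(expanded))
print("expanded conductance:", conductance(adj, expanded))
```

In this toy case the seed set alone cuts through its own neighborhood, because node C links all three seeds but is not itself a seed. Adding C leaves a single edge crossing the module boundary, so conductance drops sharply. A real analysis would also need to check modularity and guard against absorbing high-degree hub nodes.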
