...
Journal: BMC Bioinformatics

Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs


Abstract

Background: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types, and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type.

Results: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2.

Conclusions: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.
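
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a squared-error objective in which each adjacency block A_ij between node types i and j is approximated by C_i B_ij C_j^T, where C_i holds non-negative cluster memberships for type i and B_ij is the weighted cluster-level graph, and it uses standard NMF-style multiplicative updates. The function names (`fuzzy_kpartite_sketch`, `row_normalize`), the toy tripartite data, and the row normalization applied after fitting are all illustrative assumptions.

```python
import numpy as np


def fuzzy_kpartite_sketch(adj, sizes, n_clusters, n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF-style fit of A_ij ~ C_i @ B_ij @ C_j.T (not the paper's code).

    adj        -- dict {(i, j): A_ij}, i < j, A_ij non-negative, shape (sizes[i], sizes[j])
    sizes      -- number of nodes per node type
    n_clusters -- number of clusters requested per node type (sets the resolution)
    """
    rng = np.random.default_rng(seed)
    # Strictly positive random start so multiplicative updates can move every entry.
    C = [rng.random((n, m)) + 0.1 for n, m in zip(sizes, n_clusters)]
    B = {key: rng.random((n_clusters[key[0]], n_clusters[key[1]])) + 0.1 for key in adj}

    def loss():
        return sum(np.linalg.norm(A - C[i] @ B[i, j] @ C[j].T) ** 2
                   for (i, j), A in adj.items())

    history = [loss()]
    for _ in range(n_iter):
        # Update the memberships of each node type, pooling all blocks that touch it.
        for t in range(len(sizes)):
            num = np.zeros_like(C[t])
            den = np.zeros_like(C[t])
            for (i, j), A in adj.items():
                if i == t:
                    H = B[i, j] @ C[j].T        # A_ij   ~ C_t @ H
                    num += A @ H.T
                    den += C[t] @ (H @ H.T)
                elif j == t:
                    H = B[i, j].T @ C[i].T      # A_ij.T ~ C_t @ H
                    num += A.T @ H.T
                    den += C[t] @ (H @ H.T)
            C[t] *= num / (den + eps)
        # Update the cluster-level weights of each block.
        for (i, j), A in adj.items():
            num = C[i].T @ A @ C[j]
            den = C[i].T @ C[i] @ B[i, j] @ C[j].T @ C[j]
            B[i, j] *= num / (den + eps)
        history.append(loss())
    return C, B, history


def row_normalize(M):
    """Scale each row to sum to 1 so it reads as degrees of membership."""
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M), where=s > 0)


# Toy tripartite graph (think disease - gene - complex) with two planted clusters per type.
labels = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [0, 1, 1, 1]]
true_C = [np.eye(2)[lab] for lab in labels]
adj = {
    (0, 1): true_C[0] @ true_C[1].T,   # links only between matching clusters
    (1, 2): true_C[1] @ true_C[2].T,
}
sizes = [len(lab) for lab in labels]

C, B, history = fuzzy_kpartite_sketch(adj, sizes, n_clusters=[2, 2, 2])
memberships = [row_normalize(c) for c in C]
print("reconstruction error: start", history[0], "end", history[-1])
```

Changing `n_clusters` is the knob that corresponds to decomposing the network "on a chosen scale": fewer clusters give a coarser summary graph in `B`, more clusters give a finer one. Because the updates are multiplicative and start from positive values, all factors stay non-negative without an explicit projection step.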
