BMC Bioinformatics

Protein complex detection based on partially shared multi-view clustering

Abstract

Background: Protein complexes are key molecular entities that perform many essential biological functions. In recent years, high-throughput experimental techniques have generated a large amount of protein interaction data. As a consequence, computational analysis of such data for protein complex detection has received increased attention in the literature. However, most existing works focus on predicting protein complexes from a single type of data, either physical interaction data or co-complex interaction data. These two types of data provide compatible and complementary information, so it is necessary to integrate them to discover the underlying structures and obtain better performance in complex detection.

Results: In this study, we propose a novel multi-view clustering algorithm, called the Partially Shared Multi-View Clustering model (PSMVC), to carry out such an integrated analysis. Unlike traditional multi-view learning algorithms, which focus on mining either the consistent or the complementary information embedded in multi-view data, PSMVC jointly explores the shared and the view-specific information inherent in different views. In our experiments, we compare the complexes detected by PSMVC from a single data source with those detected from multiple data sources. We observe that jointly analyzing multi-view data benefits the detection of protein complexes. Furthermore, extensive experimental results demonstrate that PSMVC performs much better than 16 state-of-the-art complex detection techniques, including ensemble clustering and data integration techniques.

Conclusions: In this work, we demonstrate that, when integrating multiple data sources, a partially shared multi-view clustering model can help to identify protein complexes that are not readily identifiable by conventional single-view methods or by other integrative analysis methods. All results and source code are available at https://github.com/Oyl-CityU/PSMVC.
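The abstract does not state the PSMVC objective, so the following is only an illustrative sketch of the general idea of "partially shared" factors, not the authors' model. It assumes each view is a symmetric, nonnegative protein-by-protein matrix and approximates it as H_v H_v^T, where H_v concatenates a factor block shared by all views with a block specific to that view. The optimizer (projected gradient descent with backtracking) and every name here (`reconstruction_loss`, `fit_partially_shared`, `n_shared`, `n_specific`) are assumptions made for illustration.

```python
import numpy as np


def reconstruction_loss(views, shared, specific):
    """Sum over views of ||A_v - H_v H_v^T||_F^2 with H_v = [shared, specific_v]."""
    total = 0.0
    for adj, spec in zip(views, specific):
        factors = np.hstack([shared, spec])
        total += np.sum((adj - factors @ factors.T) ** 2)
    return total


def fit_partially_shared(views, n_shared, n_specific, n_iter=200, seed=0):
    """Hypothetical sketch: nonnegative factors, one shared block plus one block per view.

    Assumes every matrix in `views` is symmetric and has the same shape.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    shared = 0.1 * rng.random((n, n_shared))
    specific = [0.1 * rng.random((n, n_specific)) for _ in views]
    loss = reconstruction_loss(views, shared, specific)
    history = [loss]
    step = 1e-2
    for _ in range(n_iter):
        # Gradient of the squared error; valid because each view is symmetric.
        grad_shared = np.zeros_like(shared)
        grad_specific = []
        for adj, spec in zip(views, specific):
            factors = np.hstack([shared, spec])
            residual = factors @ factors.T - adj
            grad_shared += 4.0 * residual @ shared
            grad_specific.append(4.0 * residual @ spec)
        # Backtracking: halve the step until the loss does not increase.
        accepted = False
        for _ in range(30):
            new_shared = np.maximum(shared - step * grad_shared, 0.0)
            new_specific = [
                np.maximum(s - step * g, 0.0)
                for s, g in zip(specific, grad_specific)
            ]
            new_loss = reconstruction_loss(views, new_shared, new_specific)
            if new_loss <= loss:
                accepted = True
                break
            step *= 0.5
        if not accepted:
            break
        shared, specific, loss = new_shared, new_specific, new_loss
        history.append(loss)
        step *= 2.0  # allow the step to grow again after a success
    return shared, specific, history


def block_matrix(n, groups):
    """Toy adjacency matrix with ones inside each group of protein indices."""
    adj = np.zeros((n, n))
    for group in groups:
        idx = np.array(group)
        adj[np.ix_(idx, idx)] = 1.0
    return adj


# Toy data: proteins 0-2 form a complex visible in both views;
# each view also contains one complex that the other view lacks.
toy_views = [
    block_matrix(8, [[0, 1, 2], [3, 4]]),
    block_matrix(8, [[0, 1, 2], [5, 6, 7]]),
]
toy_shared, toy_specific, toy_history = fit_partially_shared(
    toy_views, n_shared=1, n_specific=1
)
```

In this toy setting, large entries in a column of the shared block would suggest proteins grouped consistently across views, while the view-specific blocks absorb structure present in only one data source. How PSMVC itself defines and extracts complexes is described in the paper and the linked repository, not here.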
