Source: U.S. National Institutes of Health literature, Frontiers in Genetics

Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks

Abstract

Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network in which the individual clusters are neither too large, which prohibits interpretation, nor too small, which precludes general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the resulting cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-paths modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-Seq data from an influenza vaccine response study. In simulated co-expression networks, RIP-M had higher cluster assignment accuracy than Newman Modularity in all scenarios and accuracy comparable to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high hard thresholds, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways.
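
As a rough illustration of the ideas named in the abstract, the sketch below shows a single Newman-style spectral bisection using the modularity matrix B = A - k kᵀ / (2m), plus one possible way to fold indirect paths into the adjacency matrix. This is not the authors' RIP-M implementation: the abstract does not say how indirect paths are weighted or how the merge-and-split step works, so the function `with_indirect_paths`, its `weight` parameter, and the choice of two-step paths are assumptions made purely for illustration.

```python
import numpy as np


def modularity_matrix(adj):
    """Newman modularity matrix B = A - k k^T / (2m) for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    total_weight = degrees.sum()  # equals 2m for an undirected network
    return adj - np.outer(degrees, degrees) / total_weight


def leading_eigenvector_split(adj):
    """Bisect the nodes by the sign of the leading eigenvector of B.

    Returns a boolean array; nodes sharing a value fall in the same group.
    """
    b = modularity_matrix(adj)
    eigvals, eigvecs = np.linalg.eigh(b)  # eigh: B is symmetric
    leading = eigvecs[:, np.argmax(eigvals)]
    return leading >= 0


def with_indirect_paths(adj, weight=0.5):
    """HYPOTHETICAL augmentation (not taken from the paper):
    add down-weighted counts of two-step paths to the direct edges."""
    adj = np.asarray(adj, dtype=float)
    two_step = adj @ adj             # entry (i, j) counts paths i -> x -> j
    np.fill_diagonal(two_step, 0.0)  # ignore paths that return to the start node
    return adj + weight * two_step


# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
toy = np.zeros((6, 6))
for i, j in edges:
    toy[i, j] = toy[j, i] = 1.0

direct_groups = leading_eigenvector_split(toy)
indirect_groups = leading_eigenvector_split(with_indirect_paths(toy))
print(direct_groups, indirect_groups)
```

On this toy network both variants place each triangle in its own group; which group is labelled `True` depends on the arbitrary sign of the eigenvector. A recursive method would repeat the split inside each group and stop or merge according to user-set size limits, which this sketch does not attempt.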