JMLR: Workshop and Conference Proceedings

A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models



Abstract

We consider the problem of including additional knowledge when estimating sparse Gaussian graphical models (sGGMs) from aggregated samples, a setting that arises often in bioinformatics and neuroimaging applications. Previous joint sGGM estimators either fail to use existing knowledge or cannot scale up to many tasks (large $K$) in a high-dimensional (large $p$) situation. In this paper, we propose a novel Joint Elementary Estimator incorporating additional Knowledge (JEEK) to infer multiple related sparse Gaussian graphical models from large-scale heterogeneous data. Using domain knowledge as weights, we design a novel hybrid norm as the minimization objective to enforce the superposition of two weighted sparsity constraints, one on the shared interactions and the other on the task-specific structural patterns. This enables JEEK to elegantly accommodate various forms of existing knowledge based on the domain at hand and avoids the need to design knowledge-specific optimization. JEEK is solved through a fast and entry-wise parallelizable solution that improves the computational cost of the state of the art from $O(p^5K^4)$ to $O(p^2K^4)$. We conduct a rigorous statistical analysis showing that JEEK achieves the same convergence rate, $O(\sqrt{\log(Kp)/n_{tot}})$, as state-of-the-art estimators that are much harder to compute. Empirically, on multiple synthetic datasets and one real-world dataset from neuroscience, JEEK significantly outperforms the state of the art in speed while achieving the same level of prediction accuracy.
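The "entry-wise parallelizable" character of the solver can be illustrated with a simplified, single-task elementary estimator for one sGGM: invert a thresholded sample covariance (so the inverse exists even when $n < p$), then soft-threshold each entry independently. This is a minimal sketch under assumptions, not JEEK itself; JEEK's multi-task, knowledge-weighted objective is richer, and the function names and threshold values below are hypothetical.

```python
import numpy as np

def soft_threshold(A, lam):
    # Entry-wise soft-thresholding: each entry is shrunk toward zero
    # independently, which is what makes the step trivially parallelizable.
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

def elementary_sggm(X, v=0.1, lam=0.2):
    """Simplified single-task elementary estimator for a sparse
    Gaussian graphical model (illustrative sketch only).

    X   : (n, p) data matrix.
    v   : threshold applied to the sample covariance before inversion.
    lam : sparsity threshold applied to the precision-matrix estimate.
    Returns a (p, p) sparse estimate of the precision matrix.
    """
    S = np.cov(X, rowvar=False)
    # Backward mapping: soft-threshold the off-diagonal of the sample
    # covariance so its inverse is well-defined in high dimensions.
    Tv = soft_threshold(S, v)
    np.fill_diagonal(Tv, np.diag(S))
    B = np.linalg.inv(Tv)
    # Closed-form, entry-wise solution: no iterative optimization needed.
    Omega = soft_threshold(B, lam)
    np.fill_diagonal(Omega, np.diag(B))
    return Omega
```

Because every entry of the estimate is computed independently after one matrix inversion, the per-entry work can be distributed across cores or tasks; JEEK exploits the same structure while additionally splitting each precision matrix into knowledge-weighted shared and task-specific parts.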
