Learning Conditional Latent Structures from Multiple Data Sources


Abstract

Data are usually present in heterogeneous sources. When dealing with multiple data sources, existing models often treat them independently and thus cannot explicitly model the correlation structures among them. To address this problem, we propose a fully Bayesian nonparametric approach to modeling correlation structures among multiple heterogeneous datasets. The proposed framework first induces a mixture distribution over the primary data source using hierarchical Dirichlet processes (HDP). Conditioned on each atom (group) discovered in the previous step, the context data sources are mutually independent, and each is generated from a hierarchical Dirichlet process. In each specific application, which covariates constitute content or context(s) is determined by the nature of the data. We also derive an efficient inference procedure and exploit the conditional independence structure to propose a (conditional) parallel Gibbs sampling scheme. We apply our model to the problem of latent activity discovery in pervasive computing using mobile data. We show the advantage of utilizing multiple data sources in terms of both exploratory analysis and quantitative clustering performance.
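The two-level generative structure described in the abstract can be illustrated with a small forward-sampling sketch. This is not the authors' implementation and does not cover their Gibbs sampler. It assumes a finite truncation of each HDP (stick-breaking global weights, Dirichlet-distributed group weights), one-dimensional Gaussian atoms, and arbitrary concentration values. All function names, parameters, and defaults below are hypothetical and chosen only for illustration.

```python
import numpy as np


def stick_breaking(rng, concentration, truncation):
    """Return truncated stick-breaking (GEM) weights that sum to 1."""
    v = rng.beta(1.0, concentration, size=truncation)
    v[-1] = 1.0  # close the final stick so no mass is left over
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_content_and_contexts(n_groups=3, n_per_group=50, n_contexts=2,
                                n_content_atoms=5, n_context_atoms=4, seed=0):
    """Forward-sample content data and conditionally independent contexts.

    Assumption: truncated HDPs with Gaussian atoms stand in for the
    nonparametric model; the concentrations (2.0, 5.0) are arbitrary.
    """
    rng = np.random.default_rng(seed)
    K, M = n_content_atoms, n_context_atoms

    # Content level: global weights shared by all groups, then per-group weights.
    global_w = stick_breaking(rng, 2.0, K)
    content_means = rng.normal(0.0, 5.0, size=K)
    group_w = rng.dirichlet(5.0 * global_w, size=n_groups)  # (n_groups, K)

    # Context level: one truncated HDP per context source, whose "groups"
    # are the content atoms, so each content atom has its own context mixture.
    ctx_means = rng.normal(0.0, 5.0, size=(n_contexts, M))
    ctx_w = np.stack([
        rng.dirichlet(5.0 * stick_breaking(rng, 2.0, M), size=K)
        for _ in range(n_contexts)
    ])  # (n_contexts, K, M)

    z = np.empty((n_groups, n_per_group), dtype=int)
    content = np.empty((n_groups, n_per_group))
    contexts = np.empty((n_contexts, n_groups, n_per_group))
    for j in range(n_groups):
        # Pick a content atom for every observation in group j.
        z[j] = rng.choice(K, size=n_per_group, p=group_w[j])
        content[j] = rng.normal(content_means[z[j]], 1.0)
        # Given the content atom, each context source is drawn independently.
        for c in range(n_contexts):
            for i in range(n_per_group):
                atom = rng.choice(M, p=ctx_w[c, z[j, i]])
                contexts[c, j, i] = rng.normal(ctx_means[c, atom], 1.0)
    return {"z": z, "content": content, "contexts": contexts,
            "group_w": group_w, "ctx_w": ctx_w}


if __name__ == "__main__":
    sample = sample_content_and_contexts()
    print(sample["content"].shape, sample["contexts"].shape)
```

In this sketch the context draws for different sources depend on each other only through the content assignment `z`. That is the conditional independence the abstract says the parallel Gibbs scheme exploits.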

Bibliographic record

Source
Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2015 | pp. 343-354 | 12 pages
Authors
Viet Huynh; Dinh Phung; Long Nguyen; Svetha Venkatesh; Hung H. Bui
Original format: PDF

