Journal: IEEE Transactions on Knowledge and Data Engineering

Learning Mixtures of Markov Chains from Aggregate Data with Structural Constraints


Abstract

Statistical models based on Markov chains, especially mixtures of Markov chains, have recently been studied and shown to be effective in data mining applications such as tourist flow analysis, animal migration modeling, and transportation administration. So far, however, research has mainly focused on analyzing data at the individual level. For security and privacy reasons, the observations available in practice usually consist of coarse-grained statistics of individual data, that is, aggregate data, which makes learning mixtures of Markov chains an even more challenging problem. In this work, we show that this problem, although intractable in its original form, can be solved approximately by imposing structural constraints on the transition matrices. The proposed structural constraints consist of specifying the active state sets corresponding to the chains and adding a pairwise sparse regularization term on the transition matrices. Based on these two structural constraints, we propose a constrained least-squares method to learn mixtures of Markov chains. We further develop a novel iterative algorithm that decomposes the overall problem into a set of convex subproblems and solves each subproblem efficiently, making it possible to learn mixtures of Markov chains from aggregate data. We propose a framework for generating synthetic data and analyze the complexity of our algorithm. We also present empirical results on the convergence and robustness of the algorithm. These results demonstrate the effectiveness and efficiency of the proposed algorithm compared with traditional methods. Experimental results on real-world data sets further validate that our algorithm can be used to solve practical problems.
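To make the idea of constrained least squares on aggregate data concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a single chain rather than a mixture, omits the pairwise sparse regularization term, and uses a boolean mask of allowed transitions (`active`) as a simplified stand-in for the active state sets. The function names, the projected-gradient solver, and the toy data are all illustrative assumptions. Given pairs of consecutive population distributions `(x, y)` with `y ≈ xP`, it minimizes the squared residual over row-stochastic matrices `P` that are zero outside the mask.

```python
def project_to_simplex(values):
    """Euclidean projection of a list of floats onto the probability simplex."""
    ordered = sorted(values, reverse=True)
    running, theta = 0.0, 0.0
    for i, u in enumerate(ordered, start=1):
        running += u
        candidate = (running - 1.0) / i
        if u - candidate > 0:
            theta = candidate  # ends at the largest i that satisfies the condition
    return [max(v - theta, 0.0) for v in values]


def step_forward(x, P):
    """Push a distribution x (row vector) one step through transition matrix P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]


def fit_transition_matrix(pairs, active, iterations=500):
    """Projected gradient descent on sum ||xP - y||^2 over row-stochastic P.

    `active[i][j]` is True where a transition i -> j is allowed (an assumed,
    simplified structural constraint); other entries of P stay at zero.
    """
    n = len(active)
    # Start each row uniform over its allowed transitions.
    P = [[(1.0 / sum(row) if ok else 0.0) for ok in row] for row in active]
    # Each x is a distribution, so 1 / (2 * len(pairs)) is a safe step size.
    step = 1.0 / (2.0 * len(pairs))
    for _ in range(iterations):
        grad = [[0.0] * n for _ in range(n)]
        for x, y in pairs:
            predicted = step_forward(x, P)
            residual = [predicted[j] - y[j] for j in range(n)]
            for i in range(n):
                for j in range(n):
                    grad[i][j] += 2.0 * x[i] * residual[j]
        for i in range(n):
            cols = [j for j in range(n) if active[i][j]]
            moved = project_to_simplex([P[i][j] - step * grad[i][j] for j in cols])
            for j, value in zip(cols, moved):
                P[i][j] = value
    return P


# Toy aggregate data: only population-level distributions are observed.
true_P = [[0.9, 0.1, 0.0],
          [0.0, 0.8, 0.2],
          [0.3, 0.0, 0.7]]
active = [[p > 0 for p in row] for row in true_P]
starts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]]

pairs = []
for start in starts:
    x = start
    for _ in range(2):
        y = step_forward(x, true_P)
        pairs.append((x, y))
        x = y

estimate = fit_transition_matrix(pairs, active)
for row in estimate:
    print([round(p, 4) for p in row])
```

In this toy setting the starting distributions include every unit vector, so the single-chain problem is well determined. With a mixture of chains the aggregate observations no longer determine the transition matrices on their own, which is the gap the abstract says the structural constraints are meant to address.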
