Conference: IEEE International Conference on Innovations in Intelligent Systems and Applications

Non-parametric Discovery of Topics and Communities in Distributed and Streaming Environments

Abstract

Several recent works have focused on improving latent-space modeling of streaming count data, such as streaming textual feeds and evolving social networks. However, many of these models do not inherently scale to large data sets, nor do they accommodate drift in the inferred latent factors (e.g., topics, social groups) over time. In addition, the functional form of the distributed and streaming processing architectures recently introduced in industry constrains how dynamic algorithms can be expressed; for example, they must be inherently stateful. We propose a comprehensive and flexible approach to distributed and dynamic inference of Bayesian count factorization models, focusing on a recently introduced nonparametric, joint topic-community factorization model called Joint Gamma Process Poisson Factorization (JGPPF). The method is illustrated in an Apache Spark implementation using twelve years of U.S. Senate voting records.
