SIAM International Conference on Data Mining

Discovering bursts revisited: guaranteed optimization of the model parameters

Abstract

One of the classic data mining tasks is to discover bursts: time intervals in which events occur at an abnormally high rate. In this paper we revisit Kleinberg's seminal work, in which bursts are discovered using an exponential distribution with a varying rate parameter: regions where it is more advantageous to set the rate higher are deemed bursty. The model depends on two parameters, the initial rate and the change rate. In that work, the initial rate, that is, the rate used when there is no burstiness, is set to the average rate over the whole sequence, and the change rate is provided by the user. We argue that these choices are suboptimal: they lead to a worse likelihood and may cause some existing bursts to be missed. We propose an alternative problem setting in which the model parameters are selected by optimizing the likelihood of the model. While this tweak is trivial from the problem-definition point of view, it changes the optimization problem greatly. To solve the problem in practice, we propose efficient (1 + ε) approximation schemes. Finally, we demonstrate empirically that with this setting we are able to discover bursts that would otherwise have gone undetected.
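
To make the baseline concrete, the following is a minimal sketch of a two-state burst detector in the spirit of the model the abstract describes. It is an illustration, not the paper's algorithm. It assumes the input is a list of inter-arrival gaps, that the bursty rate is `change_rate` times the initial rate, and that entering the bursty state costs a fixed `penalty`. The function name `detect_bursts`, the `penalty` term, and all default values are hypothetical choices made for this example; the abstract does not specify them. The most likely state sequence is found with a standard Viterbi-style dynamic program.

```python
import math


def detect_bursts(gaps, change_rate=3.0, penalty=1.0, base_rate=None):
    """Label each inter-arrival gap as 0 (normal) or 1 (bursty).

    Assumptions for this sketch (not taken from the paper):
    - gaps are positive waiting times between consecutive events;
    - the bursty rate equals change_rate * base_rate;
    - switching from normal to bursty costs `penalty`; all other moves are free.
    """
    if base_rate is None:
        # Baseline choice described in the abstract: the average rate
        # over the whole sequence.
        base_rate = len(gaps) / sum(gaps)
    rates = (base_rate, change_rate * base_rate)

    def gap_cost(state, gap):
        # Negative log-density of an exponential distribution with this rate.
        rate = rates[state]
        return rate * gap - math.log(rate)

    # best[s] = cheapest cost of any path ending in state s so far.
    # The sequence is assumed to start in the normal state.
    best = [0.0, math.inf]
    history = []
    for gap in gaps:
        step, pointers = [], []
        for state in (0, 1):
            arrive = [
                best[prev] + (penalty if (prev, state) == (0, 1) else 0.0)
                for prev in (0, 1)
            ]
            prev = 0 if arrive[0] <= arrive[1] else 1
            step.append(arrive[prev] + gap_cost(state, gap))
            pointers.append(prev)
        best = step
        history.append(pointers)

    # Backtrack from the cheaper final state to recover the labels.
    state = 0 if best[0] <= best[1] else 1
    labels = []
    for pointers in reversed(history):
        labels.append(state)
        state = pointers[state]
    labels.reverse()
    return labels
```

For example, calling `detect_bursts([10.0] * 5 + [1.0] * 5 + [10.0] * 5)` labels the five short gaps in the middle as bursty under the default settings. In this sketch `base_rate` and `change_rate` are fixed inputs, which is the setup the abstract argues is suboptimal; the paper's proposal is to choose these parameters by optimizing the likelihood instead.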