Conference: IEEE International Conference on Data Engineering

Distributed Publish/Subscribe Query Processing on the Spatio-Textual Data Stream

Abstract

A huge amount of data with both spatial and textual information, e.g., geo-tagged tweets, is flooding the Internet. Such a spatio-textual data stream contains valuable information for millions of users with various interests in different keywords and locations. Publish/subscribe systems enable efficient and effective information distribution by allowing users to register continuous queries with both spatial and textual constraints. However, the explosive growth of data scale and user base has posed challenges to the existing centralized publish/subscribe systems for spatio-textual data streams. In this paper, we propose a distributed publish/subscribe system, called PS2Stream, which digests a massive spatio-textual data stream and directs the stream to target users with registered interests. Compared with existing systems, PS2Stream achieves a better workload distribution, both minimizing the total workload and balancing the load across workers. To achieve this, we propose a new workload distribution algorithm that considers both the spatial and the textual properties of the data. Additionally, PS2Stream supports dynamic load adjustment, which lets it adapt to changes in the workload. Extensive empirical evaluation on a commercial cloud computing platform with real data validates the superiority of our system design and the performance advantages of our techniques.
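
To make the notion of a continuous query with spatial and textual constraints concrete, the following is a minimal single-process sketch in Python. It is not PS2Stream's algorithm and says nothing about its workload distribution; the `Subscription` class, the `publish` function, the rectangular region model, and the rule that every subscribed keyword must appear in the text are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """Hypothetical continuous query: a rectangle plus required keywords."""
    user: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    keywords: frozenset

    def matches(self, lon, lat, terms):
        # Spatial constraint: the object's location must fall inside the rectangle.
        in_region = (self.min_lon <= lon <= self.max_lon
                     and self.min_lat <= lat <= self.max_lat)
        # Textual constraint (assumed AND semantics): every keyword must occur.
        return in_region and self.keywords <= terms


def publish(subscriptions, lon, lat, text):
    """Return the users whose subscriptions match one geo-tagged object."""
    terms = frozenset(text.lower().split())
    return [s.user for s in subscriptions if s.matches(lon, lat, terms)]


subs = [
    Subscription("alice", -74.3, 40.5, -73.7, 40.9, frozenset({"concert"})),
    Subscription("bob", -0.5, 51.3, 0.3, 51.7, frozenset({"concert", "jazz"})),
]
matched = publish(subs, -73.98, 40.75, "Free jazz concert tonight")
print(matched)
```

This sketch scans every subscription for every incoming object, which is the kind of centralized cost that motivates distributing the work across many workers.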
