Filtering Data Streams for Entity-Based Continuous Queries

Cheng R.; Ben Kao; Kwan A.; Prabhakar S.; Yicheng Tu

Abstract

The idea of allowing query users to relax their correctness requirements in order to improve the performance of a data stream management system (e.g., location-based services and sensor networks) has recently been studied. By exploiting the maximum error (or tolerance) allowed in query answers, algorithms for reducing the use of system resources have been developed. In most of these works, however, query tolerance is expressed as a numerical value, which may be difficult to specify. We observe that in many situations, users may not be concerned with the actual value of an answer, but rather with which object satisfies a query (e.g., "who is my nearest neighbor?"). In particular, an entity-based query returns only the names of objects that satisfy the query. For these queries, it is possible to specify a tolerance that is "nonvalue-based." In this paper, we study fraction-based tolerance, a type of nonvalue-based tolerance, where a user specifies the maximum fractions of a query answer that can be false positives and false negatives. We develop fraction-based tolerance for two major classes of entity-based queries: 1) nonrank-based queries (e.g., range queries) and 2) rank-based queries (e.g., k-nearest-neighbor queries). These definitions provide users with an alternative way to specify the maximum tolerance allowed in their answers. We further investigate how these definitions can be exploited in a distributed stream environment. We design adaptive filter algorithms that allow updates to be dropped conditionally at the data stream sources without affecting the overall query correctness. Extensive experimental results show that our protocols reduce the use of network and energy resources significantly.
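To make the notion of fraction-based tolerance concrete, here is a minimal Python sketch for a range query. It is not the paper's formal definition: the abstract does not give the exact denominators, so the sketch assumes that the false-positive fraction is measured against the returned answer and the false-negative fraction against the exact answer. The names `FractionTolerance`, `error_fractions`, `within_tolerance`, and `range_query` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FractionTolerance:
    """Hypothetical container for the two user-specified bounds."""
    max_false_positive: float  # assumed: share of the returned answer that may be wrong
    max_false_negative: float  # assumed: share of the exact answer that may be missing


def range_query(values: dict[str, float], low: float, high: float) -> set[str]:
    """Entity-based range query: return only the names of matching objects."""
    return {name for name, value in values.items() if low <= value <= high}


def error_fractions(returned: set[str], exact: set[str]) -> tuple[float, float]:
    """Return (false-positive fraction, false-negative fraction).

    The denominators are an assumption of this sketch, not taken from the paper.
    """
    false_positives = len(returned - exact)
    false_negatives = len(exact - returned)
    fp_fraction = false_positives / len(returned) if returned else 0.0
    fn_fraction = false_negatives / len(exact) if exact else 0.0
    return fp_fraction, fn_fraction


def within_tolerance(returned: set[str], exact: set[str], tol: FractionTolerance) -> bool:
    """Check whether an approximate answer respects both fraction bounds."""
    fp_fraction, fn_fraction = error_fractions(returned, exact)
    return fp_fraction <= tol.max_false_positive and fn_fraction <= tol.max_false_negative


# True readings at the stream sources.
true_values = {"a": 12.0, "b": 18.0, "c": 25.0, "d": 31.0}
# Server-side view: the update for "c" was dropped at its source, so the value is stale.
server_values = {"a": 12.0, "b": 18.0, "c": 19.0, "d": 31.0}

exact = range_query(true_values, 10.0, 20.0)       # names "a" and "b"
returned = range_query(server_values, 10.0, 20.0)  # names "a", "b" and "c"

loose = FractionTolerance(max_false_positive=0.4, max_false_negative=0.1)
strict = FractionTolerance(max_false_positive=0.2, max_false_negative=0.1)
loose_ok = within_tolerance(returned, exact, loose)
strict_ok = within_tolerance(returned, exact, strict)
```

In this toy setup one of the three returned names is a false positive, so the stale answer is acceptable under the looser bound and not under the stricter one. A source-side filter in the spirit of the abstract would be allowed to drop the update for "c" only in the first case.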

Bibliographic details

Source
Knowledge and Data Engineering, IEEE Transactions on | 2010, Issue 2 | pp. 234-248 | 15 pages
Authors
Cheng R.; Ben Kao; Kwan A.; Prabhakar S.; Yicheng Tu
Author affiliation
Univ. of Hong Kong, Hong Kong, China
Original format: PDF
Language: English
Keywords
adaptive filters; information filtering; query processing; adaptive filter algorithms; data stream filtering; data stream management system; distributed stream environment; energy resources; entity-based continuous queries; fraction-based tolerance; network resources; nonrank-based query; query tolerance; rank-based query; data streams; continuous queries

Similar literature

1. Distributed processing of continuous sliding-window k-NN queries for data stream filtering [J]. Kresimir Pripuzic, Ivana Podnar Zarko, Karl Aberer. World Wide Web, 2011, Issue 5-6.
2. Processing Continuous Queries on Sensor-Based Multimedia Data Streams by Multimedia Dependency Analysis and Ontological Filtering [J]. Shi-Kuo Chang, Francesco Colace, Lei Zhao. International Journal of Software Engineering and Knowledge Engineering, 2011, Issue 8.
3. Consistent collective evaluation of multiple continuous queries for filtering heterogeneous data streams [J]. Hyun-Ho Lee, Won-Suk Lee. Knowledge and Information Systems, 2010, Issue 2.
4. Collective Evaluation of Multiple Weighted Continuous Queries for Filtering Data Streams [C]. Hyun-Ho Lee, Won-Suk Lee. International Conference on Information Knowledge Engineering (IKE'09), 2009.
5. Class-based continuous query scheduling in data stream management systems [D]. Al Moakar, Lory. 2013.
6. Effective Metadata Discovery for Dynamic Filtering of Queries to a Radiology Image Search Engine [O]. Charles E. Kahn Jr. 2008.
7. Filtering data streams for entity-based continuous queries [O]. Tu Y, Cheng R, Kwan A. 2010.
