Spatial data (e.g., points) have extensive applications in practice, such as spatial databases, Location-Based Services, spatial computing, social analyses, computational geometry, graph design, medical imaging, etc. Geometric queries, such as geometric range queries (i.e., finding points inside a geometric range) and nearest neighbor queries (i.e., finding the closest point to a given point), are fundamental primitives to analyze and retrieve information over spatial data. For example, a medical researcher can query a spatial dataset to collect information about patients in a certain geometric area to predict whether there will be a dangerous outbreak of a particular disease (e.g., Ebola or Zika). udWith the dramatic increase on the scale and size of data, many companies and organizations are outsourcing significant amounts of data, including significant amounts of spatial data, to public cloud data services in order to minimize data storage and query processing costs. For instance, major companies and organizations, such as Yelp, Foursquare and NASA, are using Amazon Web Services as their public cloud data services, which can save billions of dollars per year for those companies and organizations. However, due to the existence of attackers (e.g., a curious administrator or a hacker) on remote servers, users are worried about the leakage of their private data while storing and querying those data on public clouds. udSearchable Encryption (SE) is an innovative technique to protect the data privacy of users on public clouds without losing search functionalities on the server side. Specifically, a user can encrypt its data with SE before outsourcing data to a public server, and this public server is able to search encrypted data without decryption. Many SE schemes have been proposed to support simple queries, such as keyword search. Unfortunately, how to efficiently and securely support geometric queries over encrypted spatial data remains open. udIn this dissertation, to protect the privacy of spatial data in public clouds while still maintaining search functions without decryption, we propose a set of new SE solutions to support geometric queries, including geometric range queries and nearest neighbor queries, over encrypted spatial data. The major contributions of this dissertation focus on two aspects. First, we enrich search functionalities by designing new solutions to carry out secure fundamental geometric search queries, which were not supported in previous works. Second, we minimize the performance gap between theory and practice by building novel schemes to perform geometric queries with highly efficient search time and updates over large-scale encrypted spatial data. udSpecifically, we first design a scheme supporting circular range queries (i.e., retrieving points inside a circle) over encrypted spatial data. Instead of directly evaluating compute-then-compare operations, which are inefficient over encrypted data, we use a set of concentric circles to represent a circular range query, and then verify whether a data point is on any of those concentric circles by securely evaluating inner products over encrypted data. udNext, to enrich search functionalities, we propose a new scheme, which can support arbitrary geometric range queries, such as circles, triangles and polygons in general, over encrypted spatial data. By leveraging the properties of Bloom filters, we convert a geometric range search problem to a membership testing problem, which can be securely evaluated with inner products. Moving a step forward, we also build another new scheme, which not only supports arbitrary geometric range queries and sub-linear search time but also enables highly efficient updates. udFinally, we address the problem of secure nearest neighbor search on encrypted large-scale datasets. Specifically, we modify the algorithm of nearest neighbor search in advanced tree structures (e.g., R-trees) by simplifying operations, where evaluating comparisons alone on encrypted data is sufficient to efficiently and correctly find nearest neighbors over datasets with millions of tuples.
展开▼
机译:空间数据(例如点)在实践中具有广泛的应用,例如空间数据库,基于位置的服务,空间计算,社会分析,计算几何,图形设计,医学成像等。几何查询,例如几何范围查询(即(找到几何范围内的点)和最近的邻居查询(即,找到与给定点的最近点),是用于分析和检索空间数据信息的基本原语。例如,医学研究人员可以查询空间数据集,以收集有关特定几何区域内患者的信息,以预测特定疾病(例如埃博拉病毒或寨卡病毒)是否会爆发危险。 ud随着数据规模和大小的急剧增长,许多公司和组织正在将大量数据(包括大量空间数据)外包给公共云数据服务,以最大程度地减少数据存储和查询处理成本。例如,Yelp,Foursquare和NASA等主要公司和组织正在将Amazon Web Services用作其公共云数据服务,这可以为这些公司和组织每年节省数十亿美元。但是,由于远程服务器上存在攻击者(例如,好奇的管理员或黑客),用户担心在公共云上存储和查询这些数据时会泄漏其私有数据。 ud可搜索加密(SE)是一项创新技术,可在不丢失服务器端搜索功能的情况下保护公共云上用户的数据隐私。具体而言,用户可以在将数据外包给公共服务器之前使用SE对其数据进行加密,并且该公共服务器能够搜索加密的数据而无需解密。已经提出了许多SE方案来支持简单的查询,例如关键字搜索。不幸的是,如何有效地和安全地支持对加密的空间数据的几何查询仍然存在。 ud在本文中,为了保护公共云中空间数据的私密性,同时仍保持不解密的搜索功能,我们提出了一套新的SE解决方案,以支持对加密的空间数据进行几何查询,包括几何范围查询和最近邻查询。本文的主要贡献集中在两个方面。首先,我们通过设计新的解决方案来执行安全的基本几何搜索查询,从而丰富了搜索功能,而以前的工作中不支持这种查询。其次,我们通过建立新颖的方案以高效的搜索时间执行几何查询并在大规模加密的空间数据上进行更新来最大程度地减少理论与实践之间的性能差距。 ud具体来说,我们首先设计一种方案,以支持对加密的空间数据进行循环范围查询(即检索圆内的点)。我们没有直接评估对加密数据效率低下的“计算-然后-比较”操作,而是使用一组同心圆来表示圆范围查询,然后通过安全地评估内部数据来验证数据点是否在这些同心圆中的任何一个上产品加密数据。 ud接下来,为了丰富搜索功能,我们提出了一种新方案,该方案可以支持对加密的空间数据进行任意几何范围查询,例如通常是圆形,三角形和多边形。通过利用布隆过滤器的属性,我们将几何范围搜索问题转换为成员资格测试问题,可以使用内部产品安全地对其进行评估。向前迈进了一步,我们还构建了另一个新方案,该方案不仅支持任意几何范围查询和亚线性搜索时间,而且还可以实现高效的更新。 ud最后,我们解决了在加密的大型数据集上进行安全最近邻搜索的问题。具体而言,我们通过简化操作来修改高级树结构(例如R树)中的最近邻居搜索算法,其中仅对加密数据进行比较评估就足以有效,正确地在具有数百万个元组的数据集上找到最近邻居。
展开▼