
Small Business-Oriented Index Construction of Cloud Data


Abstract

With the development of cloud computing, data owners (businesses and individuals) are motivated to outsource their complex local database systems to the public cloud for flexibility and economic savings. To protect user privacy, however, personal data must be specially processed locally before it is outsourced to the cloud server. Given the large number of data users and documents in the cloud, it is crucial for a data owner to construct an index for the data collection, which increases the data owner's cost. Related work focuses on searching encrypted databases but rarely considers the overhead that index construction places on the data owner, or the extensibility of the index. Although traditional index construction methods from information retrieval have been widely studied, applying them directly is not necessarily suitable for our scenario. Enabling an efficient index construction service is therefore of paramount importance. In this paper, we define and solve the problem of small business-oriented index construction (SBIC). Among the various index methods, we choose the inverted index. An inverted index is an index data structure that stores a mapping from content to its locations in a set of documents; its purpose is to allow fast full-text searches. We first propose a basic SBIC scheme using Lucene (an open-source project for web search engines), and then significantly improve it to meet the requirements of efficient keyword extraction and support for multiple file types. A thorough analysis of the proposed schemes against the design goals (see Section 2.3) is given, and extensive experimental results on the dataset further show that the proposed schemes indeed introduce low overhead in time and space.
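
To make the inverted index definition above concrete, here is a minimal Python sketch. It is not the paper's SBIC scheme and does not use Lucene; the function names `build_inverted_index` and `search`, the simple lowercase alphanumeric tokenizer, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to {doc_id: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [set(index.get(term, {})) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample collection, for illustration only.
docs = {
    1: "cloud data index",
    2: "index construction for small business",
    3: "cloud index construction",
}
index = build_inverted_index(docs)
matches = search(index, "index construction")
```

A query only touches the posting lists of its own terms instead of scanning every document, which is why this structure supports fast full-text search.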
