
Efficient data management and keyword-based association discovery on graph data of large scale.



Abstract

Graphs are widely used to model problems in many domains, such as bioinformatics, cheminformatics, and the Semantic Web. We address how to store and query graph data efficiently, and how to express and efficiently answer complex search queries.

Existing graph storage and query evaluation techniques mostly store graph data in relational tables and translate graph queries into SQL queries. The mismatch between the rigid relational model and the flexible graph model prevents these techniques from simultaneously preserving the semantics of graph data, achieving high storage efficiency, and achieving high query efficiency. We propose to take advantage of the mature storage and query evaluation techniques developed for semi-structured data: graph data is decomposed into XML trees and stored in an XML repository, and each graph query is transformed into XML queries that are evaluated in that repository. Our experimental results show that the RDF-to-XML decomposition can meet all three criteria.

We studied search applications in bioinformatics, health informatics, and social networks, and observed that finding paths that satisfy constraints in a graph is critical to these search scenarios. We abstract such search requests and formally define the problem of constraint acyclic path (CAP) discovery. We study how to express CAP queries and propose a new graph query language, constraint SPARQL (cSPARQL), which can express CAP search queries as well as more complex pattern-matching queries that work together with CAP discovery. We propose efficient algorithms for the CAP discovery problem: the constraint DFS algorithms (cDFS and ecDFS) are based on DFS graph traversal with efficient pruning of search branches, while localized Search & Join (S&J) uses local information to limit the search range and prune more effectively. We implement the algorithms in a prototype system, Conkar, that can be applied to multiple domains, e.g., drug discovery.
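The abstract does not spell out the CAP constraints or the pruning rules used by cDFS and ecDFS, so the following is only a minimal sketch of the general idea of DFS-based acyclic path search under constraints, not the thesis's algorithm. It assumes, purely for illustration, that each node carries a set of keywords, that a valid path must cover a required keyword set, and that a bound on path length is the only pruning rule. The function name `constrained_acyclic_paths` and all parameter names are hypothetical.

```python
from typing import Dict, Iterator, List, Set

Graph = Dict[str, List[str]]  # adjacency list: node -> successor nodes


def constrained_acyclic_paths(
    graph: Graph,
    keywords: Dict[str, Set[str]],  # assumed: keywords attached to each node
    source: str,
    target: str,
    required: Set[str],             # assumed constraint: keywords the path must cover
    max_edges: int,                 # assumed constraint: upper bound on path length
) -> Iterator[List[str]]:
    """Yield simple (acyclic) paths from source to target whose nodes
    together cover every keyword in `required`, using at most `max_edges` edges."""
    path = [source]
    on_path = {source}

    def dfs(node: str, missing: Set[str]) -> Iterator[List[str]]:
        if node == target:
            # An acyclic path cannot pass through the target twice, so stop here.
            if not missing:
                yield list(path)
            return
        if len(path) - 1 >= max_edges:
            return  # prune: extending this branch would exceed the length bound
        for nxt in graph.get(node, []):
            if nxt in on_path:
                continue  # skip nodes already on the path to keep it acyclic
            path.append(nxt)
            on_path.add(nxt)
            yield from dfs(nxt, missing - keywords.get(nxt, set()))
            path.pop()
            on_path.remove(nxt)

    yield from dfs(source, set(required) - keywords.get(source, set()))
```

Called on a small graph whose nodes are tagged with terms such as "gene" or "disease", the generator returns only those source-to-target paths that pass through nodes covering the requested terms. A more effective search would add further pruning, for example discarding branches from which the target is unreachable.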


Bibliographic record

Author: Zhou, Mo
Author affiliation: Indiana University
Degree-granting institution: Indiana University
Discipline: Computer science
Degree: Ph.D.
Year: 2014
Pages: 149
Format: PDF
Language: English

Similar literature

1. An Effective and Efficient Technique for Supporting Privacy-Preserving Keyword-Based Search over Encrypted Data in Clouds [J]. Alfredo Cuzzocrea, Carson K. Leung, Bryan H. Wodi. Procedia Computer Science, 2020, no. 5.
2. Efficient Keyword-Based Searching Strategies for Linked Databases [J]. SRINIVAS VNVSR, SONY KRISHNA R. International Journal of Computer Science and Network Security, 2016, no. 7.
3. Keyword-based private searching on cloud data along with keyword association and dissociation using cuckoo filter [J]. Vora Aishwarya Vipul, Hegde Saumya. International Journal of Information Security, 2019, no. 3.
4. Efficient Association Discovery with Keyword-based Constraints on Large Graph Data [C]. Mo Zhou, Yifan Pan, Yuqing Wu. ACM International Conference on Information and Knowledge Management, 2011.
5. Relational discovery in sequentially-connected data streams: Efficient algorithms for lossless pattern discovery and change detection [D]. Coble, Jeffrey Allen. 2005.
6. Handling the data management needs of high-throughput sequencing data: SpeedGene, a compression algorithm for the efficient storage of genetic data [O]. Dandi Qiao, Wai-Ki Yip, Christoph Lange. 2012.
7. Efficient Association Discovery with Keyword-based Constraints on Large Graph Data [O]. Mo Zhou, Yifan Pan, Yuqing Wu. 2012.
