Ontology-Based Semantic Search for Open Government Data


Abstract

Open data are increasingly abundant, but their descriptions are often imprecise or incomplete, which makes discovering relevant datasets difficult and time-consuming. Current open data catalogues mostly provide keyword-based search, which cannot interpret the user's intent or the contextual meaning of the datasets. Ontology-based semantic search has been well explored in the Semantic Web as a way to improve the quality of search for relevant documents and web pages. This paper applies semantic and machine learning technologies to open data. It presents an approach to searching open government datasets, a relatively underexplored domain in which the semantics of the data rely on the metadata that describes them. The idea is to link published datasets to concepts from a well-defined ontology and to support search based on hybrid indexing. A simplified ontology for the transport domain is constructed to demonstrate and test the idea. A prototype search engine has been implemented that supports both manual and automatic linking to concepts in the ontology and exploits hybrid indexing based on these linking methods. Natural language processing (NLP) techniques are applied to dataset linking and indexing, making the approach independent of the natural language used to describe the datasets. Manual linking of datasets to ontology concepts is intended for domain experts and data publishers, while automatic linking is based on the provided dataset descriptions. Automatic linking reduces the overhead of manual concept linking and the dependency on domain experts. Preliminary results indicate that ontology-based semantic search is a promising approach to improving the quality and efficiency of open data search. The success of the automatic mechanism does, however, depend on the quality and comprehensiveness of the dataset descriptions.
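As a rough illustration of the linking and hybrid indexing idea described in the abstract, the following Python sketch may help. It is not the authors' implementation: the toy transport ontology, the names `ONTOLOGY`, `auto_link`, `with_ancestors` and `HybridIndex`, and the sample datasets are all hypothetical, and simple token overlap stands in for the NLP techniques the paper applies.

```python
import re
from collections import defaultdict

# Hypothetical toy ontology for the transport domain (an assumption, not the paper's):
# each concept has a set of labels and an optional parent concept.
ONTOLOGY = {
    "Transport": {"labels": {"transport", "traffic"}, "parent": None},
    "PublicTransport": {"labels": {"bus", "tram", "timetable"}, "parent": "Transport"},
    "Parking": {"labels": {"parking", "garage"}, "parent": "Transport"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def auto_link(text):
    """Automatic linking: return every concept with a label that occurs in the text."""
    tokens = tokenize(text)
    return {name for name, info in ONTOLOGY.items() if info["labels"] & tokens}


def with_ancestors(concepts):
    """Add all ancestor concepts, so a query for a parent also finds its children."""
    expanded = set()
    for concept in concepts:
        while concept is not None:
            expanded.add(concept)
            concept = ONTOLOGY[concept]["parent"]
    return expanded


class HybridIndex:
    """Combines a concept index (semantic) with a plain keyword index."""

    def __init__(self):
        self.by_concept = defaultdict(set)
        self.by_keyword = defaultdict(set)

    def add(self, dataset_id, description, manual_concepts=()):
        # Merge manual links (from experts or publishers) with automatic ones.
        concepts = with_ancestors(set(manual_concepts) | auto_link(description))
        for concept in concepts:
            self.by_concept[concept].add(dataset_id)
        for token in tokenize(description):
            self.by_keyword[token].add(dataset_id)

    def search(self, query):
        """Return dataset ids matched through query concepts or query keywords."""
        hits = set()
        for concept in auto_link(query):
            hits |= self.by_concept.get(concept, set())
        for token in tokenize(query):
            hits |= self.by_keyword.get(token, set())
        return hits


index = HybridIndex()
index.add("d1", "Bus timetable for the city centre")
index.add("d2", "Occupancy counts for municipal car parks", manual_concepts=["Parking"])
index.add("d3", "Annual budget report")

print(sorted(index.search("traffic")))  # ['d1', 'd2']
print(sorted(index.search("garage")))   # ['d2']
```

In this sketch neither `d1` nor `d2` contains the word "traffic", yet both are found because they are linked, automatically and manually respectively, to concepts under `Transport`. A plain keyword search would have missed them, which is the gap the abstract attributes to current catalogues.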


Bibliographic details

Source
IEEE International Conference on Semantic Computing | 2019 | pp. 7-15 | 9 pages
Authors
Shanshan Jiang; Thomas F. Hagelien; Marit Natvig; Jingyue Li
Original format: PDF
Keywords
Ontologies; Semantic search; Indexing; Search engines; Engines; Prototypes
Date indexed: 2022-08-26 13:53:16

Similar literature


1. An ontology-based semantic similarity metric to empower semantic search [J]. Suraiya Parveen, Ranjit Biswas. International Journal of Engineering & Technology. 2018, No. 4
2. Improving the search process through ontology-based adaptive semantic search [J]. Chyan Yang, Keng-Chieh Yang, Hsu-Chieh Yuan. The Electronic Library. 2007, No. 2
3. A Novel Quranic Search Engine Using an Ontology-Based Semantic Indexing [J]. Samia Zouaoui, Khaled Rezeg. Arabian Journal for Science and Engineering. 2021, No. 4
4. Ontology-Based Semantic Search for Open Government Data [C]. Shanshan Jiang, Thomas F. Hagelien, Marit Natvig, IEEE International Conference on Semantic Computing. 2019
5. Ontology-based Semantic Harmonization of HIV-associated Common Data Elements for Integration of Diverse HIV Research Datasets [D]. Brown, William, III. 2016
6. Finding semantic patterns in omics data using concept rule learning with an ontology-based refinement operator [O]. František Malinka, Filip Železný, Jiří Kléma. 2020
7. Ontology-based faceted semantic search with automatic sense disambiguation for bioenergy domain [O]. Pathmeswaran Raju, Lynsey Melville, Craig Chapman, 2018
