International Journal of Geographical Information Science

A comprehensive methodology for discovering semantic relationships among geospatial vocabularies using oceanographic data discovery as an example


Abstract

It is challenging to find relevant data for research and development purposes in the geospatial big data era. One long-standing problem in data discovery is locating, assimilating and utilizing the semantic context for a given query. Most research in the geospatial domain has approached this problem in one of two ways: manually building a domain-specific ontology, or automatically discovering semantic relationships using metadata and machine learning techniques. The former relies on rich expert knowledge but is static, costly and labor intensive, whereas the latter is automatic but prone to noise. An emerging trend in information science takes advantage of large-scale user search histories, which are dynamic but subject to user- and crawler-generated noise. Leveraging the benefits of these three approaches and avoiding their weaknesses, a novel methodology is proposed to (1) discover vocabulary-based semantic relationships from user search histories and clickstreams, (2) refine the similarity calculation methods from existing ontologies and (3) integrate the results of ontology, metadata, user search history and clickstream analysis to better determine their semantic relationships. An accuracy assessment of the similarity values by domain experts indicates an 83% overall accuracy for the top 10 related terms over randomly selected sample queries. This research serves as an example for building vocabulary-based semantic relationships for different geographical domains to improve various aspects of data discovery, including the accuracy of the vocabulary relationships of commonly used search terms.
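
The abstract does not give the formulas behind steps (1) and (3), so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that search-history similarity can be approximated by cosine-style co-occurrence of query terms within sessions, and that the per-source scores are merged with a simple weighted average. The function names, the weights and the sample queries are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def session_cooccurrence_similarity(sessions):
    """Assumed stand-in for step (1): cosine-style similarity between
    query terms, based on how many search sessions they share."""
    term_sessions = defaultdict(set)
    for session_id, queries in enumerate(sessions):
        for query in queries:
            term_sessions[query].add(session_id)

    sims = {}
    for a, b in combinations(sorted(term_sessions), 2):
        shared = len(term_sessions[a] & term_sessions[b])
        if shared:
            norm = sqrt(len(term_sessions[a]) * len(term_sessions[b]))
            sims[(a, b)] = shared / norm
    return sims


def integrate(sources, weights):
    """Assumed stand-in for step (3): weighted average of per-source
    similarities. A pair missing from a source contributes 0 for it."""
    total = sum(weights.values())
    combined = defaultdict(float)
    for name, sims in sources.items():
        for pair, value in sims.items():
            key = tuple(sorted(pair))  # treat (a, b) and (b, a) as one pair
            combined[key] += weights[name] * value / total
    return dict(combined)


def top_related(term, combined, k=10):
    """Return the k terms most similar to `term`, highest score first."""
    related = []
    for (a, b), score in combined.items():
        if term == a:
            related.append((b, score))
        elif term == b:
            related.append((a, score))
    return sorted(related, key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical sample data: four search sessions and one ontology link.
sessions = [
    ["ocean temperature", "sea surface temperature"],
    ["ocean temperature", "sst"],
    ["sea surface temperature", "sst"],
    ["ocean wind"],
]
ontology = {("sea surface temperature", "sst"): 1.0}

combined = integrate(
    {"search_history": session_cooccurrence_similarity(sessions),
     "ontology": ontology},
    weights={"search_history": 0.5, "ontology": 0.5},
)
print(top_related("sst", combined, k=2))
```

In this sketch a pair supported by both the ontology and the search sessions ranks above a pair seen only in the sessions, which is the kind of cross-source reinforcement the abstract describes. Metadata and clickstream scores would enter as further entries in the `sources` dictionary.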
