2017 Intelligent Systems Conference

Towards intelligent open data platforms: Discovering relatedness in datasets


Abstract

Open data platforms are central to the management and exploitation of data ecosystems. Existing platforms provide basic search and result-filtering features, but none of them recommends related datasets. Knowledge of dataset relatedness is critical for identifying datasets that can be mashed up or integrated for analysis and for the creation of data-driven services. On platforms such as data.gov, with over 193,000 datasets, or data.gov.uk, with over 40,000 datasets, specifying relatedness relationships manually is infeasible. In this paper, we approach the problem of discovering relatedness among datasets by employing the Kohonen Self-Organizing Map (SOM) algorithm to analyze the metadata extracted from the Data Catalogue maintained on a platform. Our results show that this approach is very effective in discovering relatedness relationships among datasets. Our findings also reveal that the approach can uncover interesting and valuable connections among the domains of the datasets, which could be further exploited for designing smarter data-driven services.
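The abstract does not give the feature extraction or the SOM configuration used in the paper, so the following is only a minimal sketch of the general idea, under stated assumptions: metadata records are turned into normalised bag-of-words vectors, a small Kohonen map is trained on them, and datasets that land on the same or neighbouring map cells are treated as candidates for relatedness. The sample metadata strings, the 3x3 grid, the learning-rate and neighbourhood schedules, and the function names `vectorize`, `train_som`, and `best_matching_unit` are all hypothetical.

```python
import numpy as np

def vectorize(docs):
    """Turn each metadata string into an L2-normalised term-count vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for r, d in enumerate(docs):
        for w in d.lower().split():
            X[r, index[w]] += 1.0
    # Normalise rows so that record length does not dominate distances.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)

def best_matching_unit(W, x):
    """Return the (row, col) of the map node whose weights are closest to x."""
    dist = np.linalg.norm(W - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)

def train_som(X, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols Kohonen map with a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    W = rng.random((rows, cols, X.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(epochs):
        frac = t / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.1     # neighbourhood radius shrinks
        for x in X[rng.permutation(len(X))]:
            bmu = best_matching_unit(W, x)
            d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull the winning node and its grid neighbours towards x.
            W += lr * h[..., None] * (x - W)
    return W

# Hypothetical metadata records (title plus tags), not taken from the paper.
docs = [
    "road traffic accidents transport safety",
    "traffic volume counts transport roads",
    "hospital admissions health statistics",
    "health clinic locations hospital",
    "school enrolment education statistics",
]

X = vectorize(docs)
W = train_som(X)
cells = [best_matching_unit(W, x) for x in X]
```

Here `cells[i]` is the map cell assigned to dataset `i`; a recommender built on this sketch might suggest datasets whose cells coincide or are adjacent on the grid. A real catalogue would need a richer metadata representation and a larger map than this toy setup.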
