Conference: International Conference on Theory and Practice of Digital Libraries

How Linked Data can Aid Machine Learning-Based Tasks



Abstract

Discovering useful data for a given problem is of primary importance, since data scientists usually spend a lot of time discovering, collecting and preparing data before using them for various purposes, e.g., for applying or testing machine learning algorithms. In this paper we propose a general method for easily discovering, creating and selecting valuable features that describe a set of entities, so that they can be leveraged in a machine learning context. We demonstrate the feasibility of this approach by introducing a tool (research prototype) called LODsyndesis_(MC), which is based on Linked Data technologies and which (a) automatically discovers datasets where the entities of interest occur, (b) shows the user a large number of useful features for these entities, and (c) automatically creates the selected features by sending SPARQL queries. We evaluate this approach by exploiting data from several sources, including the British National Library, to create datasets for predicting whether a book or a movie is popular or non-popular. Our evaluation uses 5-fold cross-validation, and we present comparative results for a number of different features and models. The evaluation showed that the additional features improved prediction accuracy.
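As a rough illustration of step (c) and of the 5-fold evaluation mentioned in the abstract, the sketch below builds a SPARQL query that turns one property into a numeric feature per entity, and partitions sample indices into five folds. It is not the tool's actual implementation: the abstract does not give the queries or the fold procedure, so the function names `build_feature_query` and `k_fold_indices`, the choice of a value count as the feature, and the `example.org` URIs are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def build_feature_query(entity_uris: Sequence[str], property_uri: str) -> str:
    """Build a SPARQL query that counts, per entity, the values of one property.

    Hypothetical helper: the count is one possible feature, chosen for illustration.
    """
    # VALUES restricts ?entity to the entities of interest.
    values = " ".join(f"<{uri}>" for uri in entity_uris)
    return (
        "SELECT ?entity (COUNT(?value) AS ?feature) WHERE {\n"
        f"  VALUES ?entity {{ {values} }}\n"
        # OPTIONAL keeps entities that lack the property (their count is 0).
        f"  OPTIONAL {{ ?entity <{property_uri}> ?value }}\n"
        "}\n"
        "GROUP BY ?entity"
    )


def k_fold_indices(
    n_samples: int, k: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation.

    Folds are contiguous and unshuffled; shuffle the data first if its
    order is meaningful.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    # Placeholder URIs, not taken from any real dataset.
    query = build_feature_query(
        ["http://example.org/book/1", "http://example.org/book/2"],
        "http://example.org/prop/award",
    )
    print(query)
    for train, test in k_fold_indices(12, k=5):
        print(len(train), len(test))
```

The query string could then be sent to a SPARQL endpoint with any HTTP or SPARQL client. In each fold, a model would be trained on the `train` indices and scored on the `test` indices, once with the original features and once with the added ones, to compare accuracy.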
