OWLStats: Distributed Computation of OWL Dataset Statistics

Abstract

Ontologies are now used in many application areas, including Artificial Intelligence, Natural Language Processing, Data Integration, and Knowledge Management. Knowing the internal structure, distribution, and coherence of published datasets makes them easier to reuse, interlink, integrate, reason over, and query. As OWL datasets become more prevalent, there is therefore a pressing need to obtain a clear view of them. In this paper, we present OWLStats, a software component for computing statistical information about large-scale OWL datasets in a distributed manner. We present the primary distributed in-memory approach for computing 32 different statistical criteria for OWL datasets using Apache Spark, which can scale horizontally across a cluster of machines. OWLStats has been integrated into the SANSA framework. Preliminary results indicate that OWLStats scales linearly with the size of the input data.
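
The abstract does not list the 32 criteria or show how they are computed. As a rough illustration only, the sketch below counts axioms per axiom type with a map step over partitions followed by a reduce step that merges partial counts, which is the general pattern a Spark job of this kind might follow. It is plain Python standing in for Spark, not the OWLStats or SANSA API; the sample axioms, the partition layout, and all function names are assumptions made for this example.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data: OWL axioms in functional-style syntax, split into
# partitions the way a cluster might hold them on different machines.
partitions = [
    ["SubClassOf(:Dog :Animal)", "Declaration(Class(:Dog))"],
    ["SubClassOf(:Cat :Animal)", "ObjectPropertyDomain(:owns :Person)"],
]


def axiom_type(axiom: str) -> str:
    """Return the leading keyword of an axiom, e.g. 'SubClassOf'."""
    return axiom.split("(", 1)[0].strip()


def count_partition(axioms):
    """Map step: count axiom types within a single partition."""
    return Counter(axiom_type(a) for a in axioms)


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def axiom_type_counts(parts):
    """Count axiom types across all partitions (illustrative helper)."""
    return reduce(merge_counts, map(count_partition, parts), Counter())


stats = axiom_type_counts(partitions)
```

Because each partition is counted independently and the partial counts are merged with an associative operation, the same logic could in principle be spread over more machines as the dataset grows.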