Conference: Asian Semantic Web Conference

Catriple: Extracting Triples from Wikipedia Categories


Abstract

As an important step towards bootstrapping the Semantic Web, many efforts have been made to extract triples from Wikipedia because of its wide coverage, good organization and rich knowledge. One important kind of triple relates Wikipedia articles to their non-isa properties, e.g. (Beijing, country, China). Previous work has tried to extract such triples from Wikipedia infoboxes, article text and categories. The infobox-based and text-based extraction methods depend on the infoboxes and suffer from low article coverage. In contrast, the category-based extraction methods exploit the widespread categories. However, they rely on predefined properties, which demands too much manual effort and taps only a small part of the knowledge in the categories. This paper automatically extracts properties and triples from the less explored Wikipedia categories so as to achieve wider article coverage with less manual effort. We achieve this goal by exploiting the syntax and semantics of super-sub category pairs in Wikipedia. Our prototype implementation outputs about 10M triples at 12 confidence levels ranging from 47.0% to 96.4%, covering 78.2% of Wikipedia articles. Among them, 1.27M triples have a confidence of 96.4%. Applications can select triples at whatever confidence level suits their needs.
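To make the idea of super-sub category pairs concrete, here is a minimal sketch of how one such pair might yield a property and a set of triples. It is not the paper's implementation: the abstract does not describe the actual rules, so the naming patterns ("<Class> by <property>" for the super-category, "<value> <class>" for the sub-category), the function name `extract_triples`, and the example categories and articles are all assumptions chosen for illustration.

```python
import re

# Assumed pattern (illustrative only): a super-category named
# "<Class> by <property>", e.g. "Songs by artist".
SUPER_PATTERN = re.compile(r"^(?P<cls>.+) by (?P<prop>.+)$")


def extract_triples(super_cat, sub_cat, articles):
    """Return (article, property, value) triples for one super-sub category pair.

    Hypothetical helper: returns an empty list when the pair does not
    match the assumed naming patterns.
    """
    m = SUPER_PATTERN.match(super_cat)
    if m is None:
        return []
    cls, prop = m.group("cls"), m.group("prop")

    # Assume the sub-category is named "<value> <class>",
    # e.g. "The Beatles songs" under "Songs by artist".
    suffix = " " + cls.lower()
    if not sub_cat.lower().endswith(suffix):
        return []
    value = sub_cat[: -len(suffix)]
    if not value:
        return []

    # Every article in the sub-category inherits the (property, value) pair.
    return [(article, prop, value) for article in articles]


triples = extract_triples(
    "Songs by artist",
    "The Beatles songs",
    ["Hey Jude", "Let It Be"],
)
print(triples)
```

In this sketch the property name comes from the super-category and the value from the sub-category, so no property list has to be defined by hand. A real system would also need to handle plural forms, other naming patterns, and the confidence levels mentioned in the abstract.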
