Conference: 7th Web Information Systems and Applications Conference

Dynamically Constructing a Global Schema for Web Entities


Abstract

With the rapid development of the Internet, popular entities have more and more instances on the Web. It is observed that, on one hand, different instances of the same Web entity often contain different attributes, and different instances often use different labels for the same attribute; on the other hand, new Web entity instances that contain new attributes and labels keep appearing on the Web. It is therefore difficult to dynamically construct a global schema for the Web entities of a given entity type, even though such a schema is highly desirable for Web entity instance detection, extraction, and integration. In this paper, we propose a novel approach to dynamically constructing a global schema for the Web entities of a given entity type. First, an SVM (support vector machine) classification model is built from Web entity instances that have been extracted from related Web pages. Then, based on this model, a global schema discovery approach dynamically constructs the global schema for the target entity type. Experimental results on Chinese Web sites show that the approach is general and effective.
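
The abstract does not give the features, the SVM formulation, or the discovery procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes word-token features for attribute labels, a one-vs-rest linear SVM trained by subgradient descent on the L2-regularized hinge loss, and a simple rule: a label that no classifier scores above zero becomes a new global attribute. All names (`features`, `train_linear_svm`, `GlobalSchema`) and the sample data are hypothetical.

```python
from collections import defaultdict

def features(label):
    # Assumed featurization: lowercase word tokens of the attribute label.
    return set(label.lower().split())

def train_linear_svm(examples, epochs=20, lr=0.1, lam=0.01):
    """Binary linear SVM: subgradient descent on the L2-regularized hinge loss.
    examples is a list of (feature_set, y) pairs with y in {+1, -1}."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y in examples:
            margin = y * sum(w[f] for f in x)
            for f in list(w):
                w[f] *= 1.0 - lr * lam       # shrink step from the L2 penalty
            if margin < 1.0:                 # hinge loss is active for this example
                for f in x:
                    w[f] += lr * y
    return w

class GlobalSchema:
    def __init__(self, seed):
        # seed: {global attribute: labels seen on already-extracted instances}
        self.labels = {attr: set(ls) for attr, ls in seed.items()}
        self._fit()

    def _fit(self):
        # One-vs-rest: one binary SVM per global attribute.
        self.models = {}
        for attr in self.labels:
            examples = [(features(l), 1 if a == attr else -1)
                        for a, ls in self.labels.items() for l in sorted(ls)]
            self.models[attr] = train_linear_svm(examples)

    def resolve(self, label):
        """Map a label to a global attribute, extending the schema if needed."""
        for attr, ls in self.labels.items():
            if label in ls:
                return attr
        x = features(label)
        scores = {a: sum(w.get(f, 0.0) for f in x) for a, w in self.models.items()}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] > 0:
            self.labels[best].add(label)     # new label for a known attribute
        else:
            best = label
            self.labels[best] = {label}      # previously unseen attribute
        self._fit()                          # retrain so the schema stays current
        return best

    def integrate(self, instance):
        # Rewrite one entity instance in terms of global attributes.
        return {self.resolve(label): value for label, value in instance.items()}

schema = GlobalSchema({
    "price": ["price", "sale price"],
    "brand": ["brand", "brand name"],
})
record = schema.integrate({"list price": "199", "brand": "Acme", "weight": "2 kg"})
print(record)  # {'price': '199', 'brand': 'Acme', 'weight': '2 kg'}
```

In this sketch, "list price" is merged into the existing `price` attribute because it shares a token with known price labels, while "weight" matches nothing and is added as a new attribute. Retraining after every change is the simplest way to keep the classifier in step with the schema; a real system would likely batch updates and use richer features, for example character n-grams for Chinese labels.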
