Proceedings of the 17th International World Wide Web Conference (WWW08)

Representing a Web Page as Sets of Named Entities of Multiple Types – A Model and Some Preliminary Applications

Abstract

In contrast to the “bag of words” document representation used in most information retrieval applications, we propose a model that represents a web page as sets of named entities of multiple types. Specifically, four types of named entities are extracted: person, geographic location, organization, and time. The relations among these entities are also extracted, weighted, classified, and labeled. Several applications are demonstrated on top of this model. In particular, we introduce the notion of a person-activity, which consists of four elements: person, location, time, and activity. Using this notion and a reasonably large set of web pages, we show how one person's activities can be characterized by time and location, which gives a good idea of the mobility of the person in question.
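
The abstract does not specify concrete data structures, so the following is only a minimal sketch of how the model might be laid out, assuming the entities have already been extracted by some upstream tool. The class names `PageEntities` and `PersonActivity`, the function `timeline`, and all sample values are hypothetical and chosen for illustration; relation weighting and labeling are not modeled here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PageEntities:
    """One web page as four sets of named entities (hypothetical layout)."""
    url: str
    persons: set[str] = field(default_factory=set)
    locations: set[str] = field(default_factory=set)
    organizations: set[str] = field(default_factory=set)
    times: set[str] = field(default_factory=set)


@dataclass(frozen=True)
class PersonActivity:
    """The four elements of a person-activity named in the abstract."""
    person: str
    location: str
    time: str      # assumed to be an ISO-like string so it sorts chronologically
    activity: str


def timeline(activities, person):
    """Group one person's activities by (time, location), ordered by time."""
    grouped = defaultdict(list)
    for a in activities:
        if a.person == person:
            grouped[(a.time, a.location)].append(a.activity)
    return sorted(grouped.items())


# Made-up sample data, not taken from the paper.
page = PageEntities(
    url="http://example.org/news/1",
    persons={"Alice"},
    locations={"Beijing"},
    organizations={"Example University"},
    times={"2008-04"},
)

activities = [
    PersonActivity("Alice", "Beijing", "2008-04", "gave a talk"),
    PersonActivity("Alice", "Shanghai", "2008-02", "attended a workshop"),
    PersonActivity("Bob", "Beijing", "2008-04", "chaired a session"),
]

for (time, location), acts in timeline(activities, "Alice"):
    print(time, location, acts)
```

Keying the grouping on the (time, location) pair is one simple way to read off a person's movement over time; a fuller treatment would also need the entity relations and their weights, which this sketch leaves out.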