Conference: Workshop on Collective Intelligence on Semantic Web

Semiautomatic Extraction of Topic Maps from Web Pages Using Clustering with Web Contents and Structure

Abstract

In this paper, we describe a method for semi-automatically extracting Topic Maps from a set of Web pages. We extend an existing clustering method in two ways. First, we merge only Web pages that are linked to each other, so that the underlying relationships among the topics can be extracted. Second, to generate dense clusters, we introduce a similarity measure based on the contents of the Web pages, the types of the links between them, and the distance between the directories in which the pages are located. From the clustering result, we generate the topic map by treating the clusters as topics, the edges as associations, and the Web pages related to each topic as occurrences. We extracted a topic map experimentally and evaluated it.
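
The abstract does not give the similarity formula, the link types, the weights, or a stopping rule, so the following Python sketch is only one possible reading of the pipeline, not the authors' implementation. Everything named here is an assumption made for illustration: the Jaccard word overlap used as content similarity, the `LINK_WEIGHT` table and its link-type labels, the weights in `page_sim`, the average-link cluster similarity, and the `threshold` value. It shows the two ideas the abstract names (merging only linked clusters, and a similarity that combines content, link type, and directory distance) followed by the mapping of clusters, edges, and pages to topics, associations, and occurrences.

```python
from itertools import combinations
from pathlib import PurePosixPath

# Assumed link types and scores; the abstract does not list them.
LINK_WEIGHT = {"child": 1.0, "sibling": 0.5, "reference": 0.2}

def content_sim(a, b):
    """Jaccard overlap of word sets (an assumed stand-in for content similarity)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def dir_distance(p, q):
    """Number of directory steps separating the folders of two URL paths."""
    a, b = PurePosixPath(p).parent.parts, PurePosixPath(q).parent.parts
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)

def page_sim(pages, links, p, q, w_content=0.6, w_link=0.2, w_dir=0.2):
    """Weighted mix of content, link-type, and directory closeness (assumed weights)."""
    link_type = links.get((p, q)) or links.get((q, p))
    link_score = LINK_WEIGHT.get(link_type, 0.0)
    dir_score = 1.0 / (1.0 + dir_distance(p, q))
    return (w_content * content_sim(pages[p], pages[q])
            + w_link * link_score + w_dir * dir_score)

def cluster_pages(pages, links, threshold=0.5):
    """Agglomerative clustering that only merges clusters joined by a hyperlink."""
    clusters = [frozenset([p]) for p in pages]

    def linked(c1, c2):
        return any((a, b) in links or (b, a) in links for a in c1 for b in c2)

    def sim(c1, c2):  # average-link similarity (an assumption)
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(page_sim(pages, links, a, b) for a, b in pairs) / len(pairs)

    while True:
        candidates = [(sim(c1, c2), c1, c2)
                      for c1, c2 in combinations(clusters, 2) if linked(c1, c2)]
        if not candidates:
            break
        best, c1, c2 = max(candidates, key=lambda t: t[0])
        if best < threshold:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

def to_topic_map(clusters, links):
    """Clusters become topics, inter-cluster links associations, pages occurrences."""
    topic_of = {p: i for i, c in enumerate(clusters) for p in c}
    topics = {i: sorted(c) for i, c in enumerate(clusters)}
    associations = sorted({tuple(sorted((topic_of[a], topic_of[b])))
                           for (a, b) in links if topic_of[a] != topic_of[b]})
    return {"topics": topics, "associations": associations}

# Hypothetical toy site: page path -> text, (source, target) -> link type.
pages = {
    "/jazz/index.html": "jazz music history",
    "/jazz/artists.html": "jazz music artists",
    "/cooking/index.html": "cooking recipes pasta",
    "/jazz/copy.html": "jazz music history",  # similar content, but no links
}
links = {
    ("/jazz/index.html", "/jazz/artists.html"): "child",
    ("/jazz/index.html", "/cooking/index.html"): "reference",
}
clusters = cluster_pages(pages, links)
topic_map = to_topic_map(clusters, links)
```

In this toy data the two linked jazz pages merge into one topic, while `/jazz/copy.html` stays on its own despite identical content, because no hyperlink connects it to any other page. That is the effect of the link constraint described in the abstract.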