ASIST Annual Meeting

Integrating Log-Based and Text-Based Methods Towards Automatic Web Thesaurus Construction

Abstract

This paper investigates the feasibility of automatically constructing a scalable thesaurus based on Web users' vocabularies and search interests. The proposed approach comprises two techniques: relevant term extraction and concept clustering. The former combines query-session-based and text-based methods to extract relevant terms for a given search term; the latter organizes these relevant terms into concept classes based on the results returned by search engines. Initial experiments were conducted to test the feasibility of the proposed approach to organizing Web users' vocabularies. The results show that relevant terms can be extracted efficiently and that concept classes can be better organized. The approach has great potential to benefit the automatic construction of a large-scale thesaurus for future Web IR applications.
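
The two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the measures used, so everything below is an assumption made for illustration: relevant terms are ranked by simple co-occurrence counts within query sessions (the text-based side of the extraction is omitted), and concept classes are formed by a greedy pass that groups terms whose search results overlap, measured with Jaccard similarity. The function names, the `threshold` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def session_cooccurrence(sessions):
    """Count how often two distinct queries appear in the same session."""
    counts = defaultdict(Counter)
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def relevant_terms(term, sessions, top_k=5):
    """Return up to top_k queries that most often share a session with term."""
    counts = session_cooccurrence(sessions)
    return [t for t, _ in counts[term].most_common(top_k)]


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_results(terms, results, threshold=0.3):
    """Greedy clustering (an assumed stand-in for the paper's method):
    a term joins the first cluster whose seed term has similar search
    results, otherwise it starts a new cluster."""
    clusters = []
    for t in terms:
        for cluster in clusters:
            if jaccard(results[t], results[cluster[0]]) >= threshold:
                cluster.append(t)
                break
        else:
            clusters.append([t])
    return clusters


# Toy data, invented for illustration only.
sessions = [
    ["jaguar", "jaguar car", "jaguar price"],
    ["jaguar", "jaguar car"],
    ["jaguar", "big cats"],
]
results = {
    "jaguar car": ["u1", "u2", "u3"],
    "jaguar price": ["u2", "u3", "u4"],
    "big cats": ["u7", "u8"],
}

terms = relevant_terms("jaguar", sessions)
clusters = cluster_by_results(terms, results)
print(terms)
print(clusters)
```

In this toy run the two car-related queries share half of their result URLs and fall into one class, while "big cats" shares none and forms its own. A real system would replace the hand-written `results` dictionary with pages returned by a search engine and would likely use a weighted association measure rather than raw counts.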