Journal: Computing and Informatics

Intelligent Support for Information Retrieval of Web Documents


Abstract

The main goal of this research was to investigate means of intelligent support for the retrieval of web documents. We propose the architecture of a web tool system, Trillian, which discovers users' interests without requiring their interaction and uses those interests to search autonomously for related web content. Discovered pages are then suggested to the user. Interest discovery is based on analysis of the documents the user has previously visited. We have created a module for completely transparent tracking of the user's movement on the web, which logs both the visited URLs and the contents of the web pages. The post-analysis step is based on a variant of the suffix tree clustering algorithm. We focus primarily on the overall design of the Trillian architecture and on the process of discovering topics of interest. We have implemented an experimental prototype of Trillian and evaluated the quality, speed and usefulness of the proposed system. We have shown that clustering is a feasible technique for extracting interests from web documents. We consider the proposed architecture promising and suitable for future extensions.
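The abstract does not say which variant of suffix tree clustering Trillian uses, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes whitespace tokenization and replaces the suffix tree with a plain index of word n-grams. Phrases shared by several documents form base clusters, and base clusters whose document sets overlap by more than half in both directions are merged. The function names `base_clusters` and `merge_clusters`, and the parameters `max_len`, `min_docs` and `threshold`, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def base_clusters(docs, max_len=3, min_docs=2):
    """Map each word phrase (1..max_len words) to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()  # assumption: naive whitespace tokenization
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                index[tuple(words[i:i + n])].add(doc_id)
    # Keep only phrases shared by at least min_docs documents.
    return {p: ids for p, ids in index.items() if len(ids) >= min_docs}


def merge_clusters(base, threshold=0.5):
    """Merge base clusters whose document sets overlap strongly (union-find)."""
    phrases = list(base)
    parent = list(range(len(phrases)))

    def find(x):
        # Follow parent links to the group representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(range(len(phrases)), 2):
        da, db = base[phrases[a]], base[phrases[b]]
        common = len(da & db)
        # The shared documents must exceed the threshold fraction of both clusters.
        if common / len(da) > threshold and common / len(db) > threshold:
            parent[find(a)] = find(b)

    merged = defaultdict(lambda: {"phrases": [], "docs": set()})
    for i, phrase in enumerate(phrases):
        group = merged[find(i)]
        group["phrases"].append(" ".join(phrase))
        group["docs"] |= base[phrase]
    return list(merged.values())
```

Calling `merge_clusters(base_clusters(pages))` on a list of page texts returns groups of documents together with the shared phrases that describe them, which could serve as candidate topics of interest. A real suffix tree would find shared phrases of any length in linear time; the n-gram index here trades that efficiency for brevity.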
