Journal: International Journal of Machine Learning and Cybernetics

A learning framework for information block search based on probabilistic graphical models and Fisher Kernel



Abstract

Traditional Web information retrieval methods return only a ranked list of Web pages and accept only search terms in the query. In contrast, we have developed a novel learning framework for retrieving precise information blocks from Web pages given a query, which may contain search terms as well as prior information such as the layout format of the data. The problem involves two challenging sub-tasks. The first is information block detection, in which a Web page is automatically segmented into blocks. The second is finding the information blocks relevant to the query. Existing page segmentation methods use only visual layout information or only content information and do not consider the query, so the resulting segmentation can conflict with the information need the query expresses. Our framework models the query and the block features with a probabilistic graphical model that captures both keyword information and prior information. A Fisher kernel, which can effectively incorporate the graphical model, is then employed to accomplish the two sub-tasks in a unified manner, optimizing the final goal of block retrieval performance. We have conducted experiments on benchmark datasets and real-world data, and we compare our framework with existing methods to evaluate its effectiveness.
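
To make the Fisher kernel idea concrete, here is a minimal sketch. It is not the model from the paper, whose graphical model is not specified in the abstract. As a stand-in it assumes a toy generative model in which each block is a vector of independent binary features (for example "contains a query keyword" or "matches the requested layout"), each with Bernoulli parameter theta[i]. The function names, the feature meanings, and the example data are all hypothetical. The kernel is the standard form K(x, y) = U_x^T I^{-1} U_y, where U_x is the gradient of log p(x | theta) and I is the Fisher information.

```python
def fisher_score(x, theta):
    """Gradient of the Bernoulli log-likelihood with respect to theta.

    d/dtheta_i [x_i log theta_i + (1 - x_i) log(1 - theta_i)]
        = (x_i - theta_i) / (theta_i * (1 - theta_i))
    """
    return [(xi - t) / (t * (1.0 - t)) for xi, t in zip(x, theta)]


def fisher_kernel(x, y, theta):
    """Fisher kernel U_x^T I^{-1} U_y for independent Bernoulli features.

    For this toy model the Fisher information is diagonal with entries
    1 / (theta_i * (1 - theta_i)), so its inverse is theta_i * (1 - theta_i).
    """
    ux = fisher_score(x, theta)
    uy = fisher_score(y, theta)
    return sum(a * b * t * (1.0 - t) for a, b, t in zip(ux, uy, theta))


def rank_blocks(query, blocks, theta):
    """Return block indices sorted by kernel similarity to the query (assumed ranking rule)."""
    scores = [fisher_kernel(query, b, theta) for b in blocks]
    return sorted(range(len(blocks)), key=lambda i: scores[i], reverse=True)


# Hypothetical data: two binary features per block.
theta = [0.5, 0.5]
query = [1, 0]
blocks = [[0, 1], [1, 0], [1, 1]]
ranking = rank_blocks(query, blocks, theta)
```

In this toy setting the block whose features match the query ranks first and the block with the opposite features ranks last. A real system in the spirit of the abstract would replace the Bernoulli model with the learned graphical model and feed the kernel to a discriminative learner.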
