A learning framework for information block search based on probabilistic graphical models and Fisher Kernel

Tak-Lam Wong; Haoran Xie; Wai Lam; Fu Lee Wang

Abstract

Traditional Web information retrieval methods return only a ranked list of Web pages and accept only search terms in the query. In contrast, we have developed a novel learning framework for retrieving precise information blocks from Web pages given a query, which may contain search terms as well as prior information such as the layout format of the data. The problem involves two challenging sub-tasks. The first is information block detection, in which a Web page is automatically segmented into blocks. The second is finding the information blocks relevant to the query. Existing page segmentation methods, which use only visual layout information or only content information, do not consider the query, so their segmentation may conflict with the information need the query expresses. Our framework models the query and the block features with a probabilistic graphical model to capture both keyword information and prior information. A Fisher Kernel, which can effectively incorporate the graphical model, is then employed to accomplish the two sub-tasks in a unified manner, optimizing the final goal of block retrieval performance. We have conducted experiments on benchmark datasets and real-world data, and compared our framework with existing methods to evaluate its effectiveness.
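
To make the Fisher Kernel idea concrete, here is a minimal sketch. It is not the authors' model: the abstract does not specify the graphical model, so the sketch assumes independent Bernoulli (binary) block features with hypothetical parameters `theta`. For a generative model p(x | theta), the Fisher score is the gradient of log p(x | theta) with respect to theta, and the Fisher kernel is the inner product of two scores weighted by the inverse Fisher information. The function names and the meaning of the features are illustrative assumptions.

```python
from typing import List, Sequence


def fisher_score(x: Sequence[int], theta: Sequence[float]) -> List[float]:
    """Gradient of log p(x | theta) for independent Bernoulli features.

    d/d theta_i [x_i log theta_i + (1 - x_i) log(1 - theta_i)]
        = (x_i - theta_i) / (theta_i * (1 - theta_i))
    """
    if any(not 0.0 < t < 1.0 for t in theta):
        raise ValueError("each theta_i must lie strictly between 0 and 1")
    return [(xi - t) / (t * (1.0 - t)) for xi, t in zip(x, theta)]


def fisher_kernel(x: Sequence[int], y: Sequence[int],
                  theta: Sequence[float]) -> float:
    """K(x, y) = U_x^T I^{-1} U_y.

    For this assumed model the Fisher information is diagonal with
    entries 1 / (theta_i * (1 - theta_i)), so its inverse is
    diag(theta_i * (1 - theta_i)).
    """
    ux = fisher_score(x, theta)
    uy = fisher_score(y, theta)
    return sum(a * b * t * (1.0 - t) for a, b, t in zip(ux, uy, theta))


def rank_blocks(query: Sequence[int], blocks: Sequence[Sequence[int]],
                theta: Sequence[float]) -> List[int]:
    """Return block indices ordered by decreasing kernel similarity to the query."""
    return sorted(range(len(blocks)),
                  key=lambda i: -fisher_kernel(query, blocks[i], theta))


# Hypothetical binary features, e.g. (contains keyword, table layout, is footer).
theta = [0.5, 0.5, 0.5]
query = [1, 1, 0]
blocks = [[0, 0, 1], [1, 1, 0], [1, 0, 0]]
print(rank_blocks(query, blocks, theta))  # [1, 2, 0]
```

In this toy setting the query is encoded in the same feature space as the blocks, so one kernel scores keyword features and layout features together. A richer graphical model would change only the score and Fisher information computations.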

Bibliographic details

Source
International Journal of Machine Learning and Cybernetics, 2018, No. 9, pp. 1473-1487 (15 pages)
Authors
Tak-Lam Wong; Haoran Xie; Wai Lam; Fu Lee Wang
Author affiliations

Department of Mathematics and Information Technology, The Education University of Hong Kong;

Department of Mathematics and Information Technology, The Education University of Hong Kong;

Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong;

School of Computing and Information Sciences, Caritas Institute of Higher Education;

Original format: PDF
Language: English
Keywords
Information extraction; Information block retrieval; Fisher Kernel; Graphical models;
