A Learning Classifier-Based Approach to Aligning Data Items and Labels

Abstract

Web databases are now pervasive. Query result pages are dynamically generated from these databases in response to user-submitted queries. A query result page contains a number of data records, each of which consists of data items and their labels. In this paper, we focus on the data alignment problem, in which individual data items and labels from different data records on a query result page are aligned into separate columns, each representing a group of semantically similar data items or labels drawn from these data records. We present a new approach to the data alignment problem, in which learning classifiers are trained using supervised learning to align data items and labels. Previous approaches to this problem have relied on heuristics and manually crafted rules, which are difficult to adapt to new page layouts and designs. In contrast, we are motivated to develop learning classifiers that can be easily adapted. We have implemented the proposed learning classifier-based approach in a software prototype, rAligner, and our experimental results show that the approach is highly effective.
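The data alignment problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which classifier or features are used, so the surface features, the column names ("label", "price", "title"), the tiny naive Bayes model, and the sample records below are all hypothetical stand-ins chosen only to show supervised classification of items into columns.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Hypothetical surface features of one data item or label."""
    t = text.strip()
    return (
        ("ends_colon", t.endswith(":")),
        ("has_digit", any(c.isdigit() for c in t)),
        ("has_currency", any(c in "$£€" for c in t)),
        ("many_words", len(t.split()) >= 3),
    )


class TinyNaiveBayes:
    """Stand-in classifier over binary features (an assumption, not the paper's model)."""

    def fit(self, texts, columns):
        self.class_counts = Counter(columns)
        self.feat_counts = defaultdict(Counter)
        for text, col in zip(texts, columns):
            self.feat_counts[col].update(features(text))
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())

        def score(col):
            n = self.class_counts[col]
            s = math.log(n / total)
            for f in features(text):
                # Laplace smoothing over the two values of each binary feature.
                s += math.log((self.feat_counts[col][f] + 1) / (n + 2))
            return s

        return max(self.class_counts, key=score)


def align(records, clf):
    """Place each item of each record into its predicted column."""
    table = []
    for record in records:
        row = defaultdict(list)
        for item in record:
            row[clf.predict(item)].append(item)
        table.append(dict(row))
    return table


# Hand-labelled training items (made-up examples).
train_texts = ["Price:", "Author:", "$12.99", "£8.50",
               "Learning Web Data Extraction", "A Guide to Query Pages"]
train_cols = ["label", "label", "price", "price", "title", "title"]
clf = TinyNaiveBayes().fit(train_texts, train_cols)

# Two records whose items appear in different orders on the page.
records = [["Deep Web Search Basics", "Price:", "€20.00"],
           ["Cost:", "$5", "Mining Query Result Pages"]]
table = align(records, clf)
for row in table:
    print(row)
```

Each row of `table` maps a column name to the items of one record assigned to it, so semantically similar items from different records end up under the same key regardless of their order on the page. Adapting the sketch to a new layout would mean relabelling training items rather than rewriting rules, which is the motivation the abstract gives for a learned approach.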

Bibliographic record

  • Source
    Big data | 2013 | pp. 282-291 | 10 pages
  • Conference location: Oxford (GB)
  • Authors

    Neil Anderson; Jun Hong;

  • Affiliations

    School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, UK;

    School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, UK;

  • Conference organizer
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords
