Conference: International Multi-Topic Conference

Population of data in web-tables schema

Abstract

Tabular data is a readily available source of information on the web. We work with a collection of HTML tables taken from the web. Good-quality tables are identified first, and schema matching is then performed. Schema matching identifies the correspondences that determine which elements of two different schemas are similar; columns and data values are compared one after the other to match the schemas. The main issue is that a web search engine, when queried for tabular data, may return URLs rather than the tabular data itself. To address this issue, we extracted the data and the schema of tabular web pages and then matched the schemas by identifying correspondences between similar elements with a corpus-based technique. After schema matching, we populated the data of the HTML pages by joining related tables into a single HTML table, which is more appropriate and helpful for users.
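
The match-then-join idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names a corpus-based technique without giving details, so this sketch substitutes a simple heuristic that blends header-name similarity with data-value overlap. The function names, the 0.5 weight and threshold, and the toy tables are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Jaccard overlap between two collections of cell values."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


def column_score(name_a, values_a, name_b, values_b, name_weight=0.5):
    """Blend header-name similarity with value overlap (the weight is an assumption)."""
    name_sim = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    return name_weight * name_sim + (1 - name_weight) * jaccard(values_a, values_b)


def match_schemas(table_a, table_b, threshold=0.5):
    """Greedily map each column of table_a to its best-scoring unused column of table_b.

    A table is a dict of column name -> list of cell values.
    Pairs scoring below `threshold` are left unmatched.
    """
    matches, used = {}, set()
    for name_a, values_a in table_a.items():
        best, best_score = None, threshold
        for name_b, values_b in table_b.items():
            if name_b in used:
                continue
            score = column_score(name_a, values_a, name_b, values_b)
            if score >= best_score:
                best, best_score = name_b, score
        if best is not None:
            matches[name_a] = best
            used.add(best)
    return matches


def join_tables(table_a, table_b, matches, key):
    """Inner-join the two tables on `key`, a column of table_a that has a match."""
    key_b = matches[key]
    matched_b = set(matches.values())
    # Unmatched columns of table_b are carried over; rename on a name clash.
    extra_b = {c: (c if c not in table_a else c + "_b")
               for c in table_b if c not in matched_b}
    # If a key value repeats in table_b, the last occurrence wins.
    index_b = {value: i for i, value in enumerate(table_b[key_b])}
    joined = {c: [] for c in list(table_a) + list(extra_b.values())}
    for i, value in enumerate(table_a[key]):
        j = index_b.get(value)
        if j is None:
            continue  # no partner row in table_b
        for c in table_a:
            joined[c].append(table_a[c][i])
        for c, out_name in extra_b.items():
            joined[out_name].append(table_b[c][j])
    return joined


# Toy tables with made-up values, standing in for two extracted HTML tables.
table_a = {"Fruit": ["apple", "banana", "cherry"],
           "Color": ["red", "yellow", "red"]}
table_b = {"fruit name": ["banana", "apple", "kiwi"],
           "Price": [2, 1, 3]}

matches = match_schemas(table_a, table_b)
joined = join_tables(table_a, table_b, matches, key="Fruit")
print(matches)
print(joined)
```

The greedy one-to-one assignment keeps the sketch short; a real system would likely need a quality filter for the input tables and a more robust matcher, as the abstract suggests.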
