
A strategy for extracting information from semi-structured web pages


Abstract

Purpose - The aim of this paper is to propose a strategy for extracting information from web tables. Design/methodology/approach - The paper presents a strategy for extracting information from web tables of semi-structured web pages (WPs) by handling the issue of synonyms, which emerges because these WPs have been designed and created without reference to any standards or guidelines. Findings - The paper finds that this strategy extracts information with high precision, and that it extracts the attributes as well as the sub-attributes that describe them and the values of those sub-attributes. Practical implications - An experiment conducted on the Nokia products domain demonstrated that the proposed strategy extracts information from web tables with a high precision of 98.98 percent. Originality/value - This paper contributes to the research on extracting information.
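The abstract does not describe how synonyms are resolved, so the following is only a minimal sketch of the general idea, not the authors' strategy: read label/value rows from a web table and map header synonyms to one canonical attribute name. The `SYNONYMS` table, the class `TableRowParser`, and the function `extract_attributes` are hypothetical names introduced for illustration, and sub-attributes are not handled.

```python
from html.parser import HTMLParser

# Hypothetical synonym table (an assumption, not from the paper):
# maps header variants to one canonical attribute name.
SYNONYMS = {
    "weight": "weight",
    "mass": "weight",
    "display": "display",
    "screen": "display",
}


class TableRowParser(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def extract_attributes(html):
    """Return {canonical attribute: value} from two-column label/value rows."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    result = {}
    for row in parser.rows:
        if len(row) < 2:
            continue  # skip rows without a label/value pair
        label = row[0].lower()
        # Unknown labels are kept as-is rather than discarded.
        result[SYNONYMS.get(label, label)] = row[1]
    return result
```

Under these assumptions, two pages that label the same property "Mass" and "Weight" would both yield the key `weight`, which is the kind of mismatch the abstract attributes to pages built without shared standards.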

Bibliographic record

  • Source
    International Journal of Web Information Systems | 2010, Issue 4 | pp. 304-318 | 15 pages
  • Author affiliations

    Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia;

    Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia;

    Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia;

    Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    internet; data handling; information retrieval;

  • Date added: 2022-08-17 13:47:23
