Conference: IEEE Symposium on Web Society

Web information extraction based on news domain ontology theory


Abstract

Because current web information extraction methods cannot adapt to varied page structures, this paper proposes a web information extraction method based on a news domain ontology. Information extraction rules generated from the news domain ontology are used to locate the relevant page regions accurately and to extract the information of interest. An information extraction system based on the news domain ontology is implemented using page processing, page conversion, XPath, and related techniques. Tests on a news site show that the proposed approach does not rely on page structure and can increase the recall and precision of information extraction.
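The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: extraction rules are derived from domain concepts instead of being hard-coded as page-specific paths. Everything in it is an assumption made for illustration: the miniature NEWS_ONTOLOGY dictionary, the generate_rules and extract helpers, the choice of class attributes as the matching signal, and the two sample pages. It uses Python's standard xml.etree.ElementTree, which supports only a limited XPath subset, and it assumes the page has already been converted to well-formed XML (the "page conversion" step, whose tooling the abstract does not specify).

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature "news domain ontology": each concept lists
# vocabulary that may appear as the class attribute of its element.
NEWS_ONTOLOGY = {
    "title": ["title", "headline"],
    "author": ["author", "byline"],
    "date": ["date", "time", "published"],
    "body": ["content", "article-body", "story"],
}


def generate_rules(ontology):
    """Derive one XPath expression per vocabulary term of each concept."""
    return {
        concept: [f".//*[@class='{term}']" for term in terms]
        for concept, terms in ontology.items()
    }


def extract(xhtml, rules):
    """Apply the rules to a well-formed page; keep the first match per concept."""
    root = ET.fromstring(xhtml)
    result = {}
    for concept, xpaths in rules.items():
        for xpath in xpaths:
            node = root.find(xpath)
            if node is not None:
                # Collect all nested text and normalize whitespace.
                text = " ".join(" ".join(node.itertext()).split())
                result[concept] = text
                break
    return result


# Two illustrative pages with different layouts (div-based vs. table-based).
PAGE_A = (
    "<html><body><div>"
    "<h1 class='headline'>Flood warning issued</h1>"
    "<span class='byline'>A. Reporter</span>"
    "<div class='content'><p>Rivers rose overnight.</p>"
    "<p>Residents were advised to move.</p></div>"
    "</div></body></html>"
)
PAGE_B = (
    "<html><body><table>"
    "<tr><td class='title'>Flood warning issued</td></tr>"
    "<tr><td class='author'>A. Reporter</td></tr>"
    "<tr><td class='story'>Rivers rose overnight.</td></tr>"
    "</table></body></html>"
)

if __name__ == "__main__":
    rules = generate_rules(NEWS_ONTOLOGY)
    for page in (PAGE_A, PAGE_B):
        print(extract(page, rules))
```

Because the rules key on domain vocabulary and not on element positions, the same rule set finds the title and author in both layouts. A real system would need a richer ontology and more robust matching than exact class names; this sketch only illustrates why such rules need not depend on a fixed page structure.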
