Conference: International Conference on Management Science and Intelligent Control

The Realization of Web Information Extraction Based on XML

Abstract

This paper introduces a method of web information extraction based on XML. First, it converts the data from HTML to XHTML with Tidy tools. It then uses a path expression to locate the anchor that is tied to the target content. Finally, it maps the extraction result to an XML file with XSL. The method converts unstructured data into structured data, which makes web data usable by application programs. An example that extracts earthquake information is implemented. The extraction rules are simple and robust, and the code can be widely adopted.
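The locate-and-map steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the page has already been cleaned into well-formed XHTML (the Tidy step is not shown), it uses Python's standard-library ElementTree to build the output instead of an XSL stylesheet, and the table layout, the `quakes` id, the sample rows, and the output element names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML, standing in for the output of a Tidy-style cleanup step.
# Real XHTML usually declares a namespace; it is omitted here to keep paths short.
XHTML = """<html>
  <body>
    <table id="quakes">
      <tr><th>Time</th><th>Magnitude</th><th>Location</th></tr>
      <tr><td>2010-01-01 08:00</td><td>5.1</td><td>Region A</td></tr>
      <tr><td>2010-01-02 13:30</td><td>4.3</td><td>Region B</td></tr>
    </table>
  </body>
</html>"""

# Assumed output element names, one per table column.
FIELDS = ("time", "magnitude", "location")


def extract_quakes(xhtml: str) -> ET.Element:
    """Locate table rows with a path expression and map them to an XML tree."""
    root = ET.fromstring(xhtml)
    # Path expression: anchor on the table by its id, then select its rows.
    rows = root.findall(".//table[@id='quakes']/tr")
    result = ET.Element("earthquakes")
    for row in rows:
        cells = [(td.text or "").strip() for td in row.findall("td")]
        if len(cells) != len(FIELDS):
            continue  # the header row has <th> cells only, so it is skipped
        quake = ET.SubElement(result, "earthquake")
        for name, value in zip(FIELDS, cells):
            ET.SubElement(quake, name).text = value
    return result


if __name__ == "__main__":
    print(ET.tostring(extract_quakes(XHTML), encoding="unicode"))
```

Anchoring on a stable attribute such as an id, rather than on an absolute position in the document, is one way an extraction rule might stay robust when the surrounding page layout changes.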
