According to the web page extraction,an adaptive web data extraction method based on extraction template was proposed.The adaptive web extraction process was given.The extraction rules and the adaptive search rules were de-fined,the matching method of the web page and the extraction template was presented,and the process of target data search and extraction template adaptive repair was described in details.Experimental results showed that the recall rate and preci-sion rate were more than 95%,and the method can effectively reduce the quantity of extraction templates.%针对 Web页面数据抽取问题,提出了一种基于抽取模板的自适应 Web 页面数据抽取方法。给出了自适应web数据抽取的整体流程,详细介绍了抽取模板中抽取规则和自适应搜索规则的定义方式,web 页面与抽取模板的匹配方法,以及抽取路径失效后目标数据的搜索与抽取模板的自适应修改过程。实验结果表明,基于抽取模板的自适应 web 页面数据抽取方法的召回率和查准率都达到95%以上,方法中的自适应搜索规则有效地减少了抽取模板的制定数量。
展开▼