Conference: International Conference on Sensors, Mechatronics and Automation

A Type of Web Content Extraction Algorithm Based on Adaptive Threshold

Abstract

Building on text extraction based on text density, this paper proposes a web page text extraction algorithm with an adaptive threshold, obtained by combining text density with the Otsu threshold algorithm. The algorithm was applied in a new rural community employment information service system to fetch employment information from related government affairs websites. In comparative text extraction experiments on web pages from "The Ministry of Human Resources and Social Security of the People's Republic of China", "The Ministry of Human Resources and Social Security Hall of Henan Province" and "Sina.com", the text extraction rate of the algorithm reached 90%, 92% and 92%, respectively. The results indicate that applying the algorithm in the new rural community employment information service system can provide technical support for targeted acquisition of employment information and enable accurate employment information retrieval.
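
The idea of choosing a density cutoff with Otsu's method can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define text density, so the sketch assumes a per-line density equal to visible characters divided by total characters, and the names line_densities, otsu_threshold and extract_content are hypothetical.

```python
import re

# Assumption: a simple regex is enough to strip tags for this illustration.
TAG_RE = re.compile(r"<[^>]+>")


def line_densities(html):
    """Return (visible_text, density) for each non-empty line.

    Density here is an assumed definition: visible chars / total chars.
    """
    rows = []
    for raw in html.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        text = TAG_RE.sub("", raw).strip()
        rows.append((text, len(text) / len(raw)))
    return rows


def otsu_threshold(values, bins=64):
    """Otsu's method on a 1-D sample.

    Builds a histogram and returns the cut that maximizes the
    between-class variance of the two resulting groups.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical: no meaningful split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_var, best_cut = var, lo + (i + 1) * width
    return best_cut


def extract_content(html):
    """Keep lines whose density is at or above the adaptive (Otsu) cutoff."""
    rows = line_densities(html)
    if not rows:
        return []
    cut = otsu_threshold([d for _, d in rows])
    return [text for text, d in rows if d >= cut]
```

Because the cutoff is recomputed from each page's own density distribution, no fixed threshold has to be tuned per website, which is presumably what "adaptive" refers to in the abstract.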
