A Type of Web Content Extraction Algorithm Based on Adaptive Threshold

Abstract

Building on text extraction based on text density, a Web page text extraction algorithm based on an adaptive threshold is proposed, combined with the Otsu threshold algorithm. It was applied in a new rural community employment information service system to fetch employment information from related government affairs websites. In comparative text extraction experiments on Web pages from the Ministry of Human Resources and Social Security of the People's Republic of China, the Department of Human Resources and Social Security of Henan Province, and Sina.com, the algorithm's text extraction rate reached 90%, 92% and 92%, respectively. The results indicate that applying the algorithm in the new rural community employment information service system can provide technical support for targeted employment information acquisition and enable accurate employment information retrieval.
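The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that text density is the share of a line's characters left after tags are stripped, and that the adaptive threshold is obtained by running Otsu's method over the per-line densities. The names `text_density`, `otsu_threshold` and `extract_text`, and the bin count, are hypothetical choices made for illustration.

```python
import re

TAG = re.compile(r"<[^>]*>")  # crude tag matcher, sufficient for a sketch


def text_density(line: str) -> float:
    """Assumed definition: visible characters divided by all characters."""
    stripped = line.strip()
    if not stripped:
        return 0.0
    return len(TAG.sub("", stripped).strip()) / len(stripped)


def otsu_threshold(values: list[float], bins: int = 32) -> float:
    """Otsu's method: choose the cut that maximises between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all densities equal, nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0, sum0 = 0, 0.0
    best_var, best_bin = -1.0, 0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (unnormalised)
        if var > best_var:
            best_var, best_bin = var, i
    # The threshold is the upper edge of the last low-density bin.
    return lo + (best_bin + 1) * width


def extract_text(html: str) -> list[str]:
    """Keep the lines whose density is at or above the page-specific threshold."""
    lines = [ln.strip() for ln in html.splitlines() if ln.strip()]
    if not lines:
        return []
    densities = [text_density(ln) for ln in lines]
    threshold = otsu_threshold(densities)
    return [TAG.sub("", ln).strip()
            for ln, d in zip(lines, densities) if d >= threshold]
```

Because the threshold is recomputed from each page's own density distribution, no fixed cut-off has to be tuned per site, which is the property the abstract attributes to the adaptive approach.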

Bibliographic details

Source
International Conference on Sensors, Mechatronics and Automation | 2017 | 779p | 7 pages in total
Authors
Guang Zheng; Xianghui Hui; Xin Xu; Lei Xi;
Original format: PDF
Chinese Library Classification: TP2-53
Keywords
New rural community; Web information fetching; Text density; Adaptive threshold; Otsu threshold algorithm; Web page text extraction algorithm;

