A Classification Method for Web Information Extraction

LI Xiang-yang; ZHANG Ya-fei; LU Jian-jiang; XU Bao-wen

Abstract

Web information extraction is viewed as a classification process, and a competing classification method is presented that extracts Web information directly through classification. Web fragments are represented with three general features, and the similarities between fragments are defined on the basis of these features. Through competition among fragments for the different slots in information templates, the method classifies fragments into slot classes and filters out noise. Far fewer annotated samples are needed than with rule-based methods, so the method is highly portable. Experiments show that the method performs well and is superior to the DOM-based method in information extraction.
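The competition among fragments for template slots can be illustrated with a minimal sketch. The abstract does not name the three features, the similarity function, or the competition rule, so everything below is an assumption made for illustration: the features (tag path, relative position, text tokens), the averaged `similarity`, the greedy `compete` procedure, the one-fragment-per-slot restriction, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    text: str          # hypothetical feature: token set is derived from this
    tag_path: str      # hypothetical feature: path of enclosing HTML tags
    position: float    # hypothetical feature: relative position in the page, 0..1


def similarity(a: Fragment, b: Fragment) -> float:
    """Average of three per-feature similarities, each in [0, 1] (assumed form)."""
    path_sim = 1.0 if a.tag_path == b.tag_path else 0.0
    pos_sim = 1.0 - abs(a.position - b.position)
    ta, tb = set(a.text.lower().split()), set(b.text.lower().split())
    text_sim = len(ta & tb) / len(ta | tb) if ta | tb else 0.0
    return (path_sim + pos_sim + text_sim) / 3.0


def compete(fragments, samples, threshold=0.5):
    """Let fragments compete for slots; fragments that win no slot are noise.

    `samples` maps a slot name to a non-empty list of annotated example fragments.
    """
    scored = []
    for i, frag in enumerate(fragments):
        for slot, examples in samples.items():
            score = max(similarity(frag, ex) for ex in examples)
            scored.append((score, i, slot))
    scored.sort(key=lambda t: t[0], reverse=True)  # strongest claims first

    assigned, used = {}, set()
    for score, i, slot in scored:
        if score < threshold:
            break  # all remaining claims are too weak
        if slot not in assigned and i not in used:
            assigned[slot] = fragments[i]
            used.add(i)
    noise = [f for i, f in enumerate(fragments) if i not in used]
    return assigned, noise


# One annotated sample per slot, then a new page with three fragments.
samples = {
    "title": [Fragment("Introduction to Data Mining", "html/body/h1", 0.1)],
    "price": [Fragment("Price: $ 30.00", "html/body/span", 0.5)],
}
page = [
    Fragment("Learning Web Data Mining", "html/body/h1", 0.1),
    Fragment("Price: $ 45.50", "html/body/span", 0.6),
    Fragment("Copyright 2004 All rights reserved", "html/body/div", 0.95),
]
assigned, noise = compete(page, samples)
for slot, frag in sorted(assigned.items()):
    print(slot, "->", frag.text)
print("noise:", [f.text for f in noise])
```

In this toy run a single annotated sample per slot is enough to place the title and price fragments, and the copyright line is left as noise because it clears the threshold for neither slot. This mirrors the abstract's claim about needing few annotated samples, but only as an illustration, not as a reproduction of the paper's experiments.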


Bibliographic record

Source
Wuhan University Journal of Natural Sciences | 2004, Issue 5 | 5 pages
Authors
LI Xiang-yang; ZHANG Ya-fei; LU Jian-jiang; XU Bao-wen
Original format: PDF
Language: English
Chinese Library Classification: General natural sciences
Keywords
information extraction; competing classification; feature extraction; wrapper induction


