Boosted Wrapper Induction



Abstract

Recent work in machine learning for information extraction has focused on two distinct sub-problems: the conventional problem of filling template slots from natural language text, and the problem of wrapper induction, learning simple extraction procedures ("wrappers") for highly structured text such as Web pages produced by CGI scripts. For suitably regular domains, existing wrapper induction algorithms can efficiently learn wrappers that are simple and highly accurate, but the regularity bias of these algorithms makes them unsuitable for most conventional information extraction tasks. Boosting is a technique for improving the performance of a simple machine learning algorithm by repeatedly applying it to the training set with different example weightings. We describe an algorithm that learns simple, low-coverage wrapper-like extraction patterns, which we then apply to conventional information extraction problems using boosting. The result is BWI, a trainable information extraction system with a strong precision bias and F1 performance better than state-of-the-art techniques in many domains.
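The abstract's description of boosting (reapplying a simple learner to the training set under different example weightings) can be illustrated with a minimal sketch. This is not the BWI algorithm: it assumes a generic AdaBoost-style weight update and uses one-dimensional threshold rules as the simple learner, whereas the abstract says BWI's simple learner produces wrapper-like extraction patterns. All function names and the toy data are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Threshold rule: predict `polarity` when x >= threshold, else the opposite label."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Simple learner (assumed): pick the threshold rule with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Reapply the simple learner, reweighting the examples after each round."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this rule
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight; correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all learned rules."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): no single threshold rule separates these labels.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy set a single threshold rule misclassifies at least two of the six points, while the three-round weighted vote labels all six correctly, which is the effect the abstract attributes to boosting a simple learner.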


Bibliographic details

Source: National Conference on Artificial Intelligence, 2000, 7 pages
Authors: Dayne Freitag; Nicholas Kushmerick
Original format: PDF
Chinese Library Classification (CLC): TP18-53

