Extracting Structured Information from Wikipedia Articles to Populate Infoboxes


Abstract

Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values to independently extract value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
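
The idea of extracting value parts independently can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not describe iPopulator's extractors, so plain regular expressions stand in for them here. The attribute (an employee count), its assumed value structure "number (year)", and the names `PART_PATTERNS`, `extract_parts`, `assemble`, and `precision` are all hypothetical.

```python
import re

# Assumed structure for a hypothetical "employees" attribute:
# a number part, optionally followed by a year part in parentheses.
PART_PATTERNS = {
    "number": re.compile(r"\b(\d{1,3}(?:,\d{3})+|\d+)\s+employees\b", re.IGNORECASE),
    "year": re.compile(r"\b(?:in|as of)\s+((?:19|20)\d{2})\b", re.IGNORECASE),
}


def extract_parts(text):
    """Extract each value part on its own; a part that is not found is None."""
    parts = {}
    for name, pattern in PART_PATTERNS.items():
        match = pattern.search(text)
        parts[name] = match.group(1) if match else None
    return parts


def assemble(parts):
    """Recombine the parts according to the assumed structure 'number (year)'."""
    if parts["number"] is None:
        return None
    if parts["year"] is None:
        return parts["number"]
    return f"{parts['number']} ({parts['year']})"


def precision(extracted, expected):
    """Share of attempted extractions (non-None) that equal the expected value."""
    attempted = [(e, g) for e, g in zip(extracted, expected) if e is not None]
    if not attempted:
        return 0.0
    return sum(e == g for e, g in attempted) / len(attempted)


if __name__ == "__main__":
    sentence = "The company had 5,538 employees in 2008."
    print(assemble(extract_parts(sentence)))
```

Because each part has its own pattern, a sentence that mentions only the number still yields a partial value instead of no value at all. The `precision` helper counts only attempted extractions, which matches the usual definition of extraction precision; whether the paper computes its 91% figure in exactly this way is not stated in the abstract.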


Bibliographic details

Source
CIKM 10; ACM conference on information and knowledge management | 2011 | pp. 1661-1664 | 4 pages
Authors
Dustin Lange; Christoph Bohm; Felix Naumann
Original format: PDF
Chinese Library Classification: Information processing
Keywords
information extraction; linked data; wikipedia;


Similar literature

1. Extracting complementary information from Wikipedia articles of different languages [J]. Akiyo Nadamoto, Yuya Fujiwara, Yu Suzuki. International Journal of Business Intelligence and Data Mining. 2013, No. 1
2. Path-based methods on categorical structures for conceptual representation of wikipedia articles [J]. Kucharczyk Lukasz, Szymanski Julian. Journal of Intelligent Information Systems. 2017, No. 2
3. Relating Wikipedia article quality to edit behavior and link structure [J]. Thorsten Ruprechter, Tiago Santos, Denis Helic. Applied Network Science. 2020, No. 1
4. Extracting Structured Information from Wikipedia Articles to Populate Infoboxes [C]. Dustin Lange, Christoph Bohm, Felix Naumann. ACM conference on information and knowledge management. 2010
5. How Wikipedia Editors Collaborate on Article 'Talk' Pages [D]. Magnuson, Victor. 2018
6. How the structure of Wikipedia articles influences user navigation [O]. Daniel Lamprecht, Kristina Lerman, Denis Helic.
7. Extracting Imperatives from Wikipedia Article for Deletion Discussions [O]. Fiona Mao, Robert E. Mercer, Lu Xiao. 2015
