Semantic Web (journal)

N-ary relation extraction for simultaneous T-Box and A-Box knowledge base augmentation



Abstract

The Web has evolved into a huge mine of knowledge carved in different forms, the predominant one still being the free-text document. This motivates the need for intelligent Web-reading agents: hypothetically, they would skim through corpora of disparate Web sources and generate meaningful structured assertions to fuel knowledge bases (KBs). Ultimately, comprehensive KBs, like WIKIDATA and DBPEDIA, play a fundamental role in coping with the issue of information overload. In light of this vision, this paper presents the FACT EXTRACTOR, a complete natural language processing (NLP) pipeline that reads an input textual corpus and produces machine-readable statements. Each statement is supplied with a confidence score and undergoes a disambiguation step via entity linking, thus allowing the assignment of KB-compliant URIs. The system implements four research contributions: it (1) performs n-ary relation extraction by applying the frame semantics linguistic theory, as opposed to binary techniques; (2) simultaneously populates both the T-Box and the A-Box of the target KB; (3) relies on a single NLP layer, namely part-of-speech tagging; and (4) enables a completely supervised yet reasonably priced machine learning environment through a crowdsourcing strategy. We assess our approach by setting the target KB to DBpedia and by considering a use case of 52,000 Italian Wikipedia soccer player articles. From these, we yield a dataset of more than 213,000 triples with an estimated F1 of 81.27%. We corroborate the evaluation via (i) a performance comparison with a baseline system and (ii) an analysis of the T-Box and A-Box augmentation capabilities. The outcomes are incorporated into the Italian DBpedia chapter, can be queried through its SPARQL endpoint, and can be downloaded as standalone data dumps. The codebase is released as free software and is publicly available in the DBpedia Association repository.
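To make the T-Box/A-Box distinction concrete, the following is a minimal sketch (not the authors' released code) of how one frame-based n-ary extraction could be serialized as RDF with rdflib. The FACT namespace, the "Debut" frame, its frame-element properties, the statement URI, and the confidence property are illustrative assumptions, not the paper's actual schema; entity URIs would normally come from the entity-linking step.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS, XSD

DBR = Namespace("http://it.dbpedia.org/resource/")               # Italian DBpedia resources
FACT = Namespace("http://example.org/fact-extractor/ontology/")  # hypothetical ontology namespace

g = Graph()
g.bind("dbr", DBR)
g.bind("fact", FACT)

# T-Box augmentation: the linguistic frame becomes a class and its
# frame elements become properties whose domain is that class.
g.add((FACT.Debut, RDF.type, RDFS.Class))
for frame_element in (FACT.player, FACT.team, FACT.competition, FACT.year):
    g.add((frame_element, RDF.type, RDF.Property))
    g.add((frame_element, RDFS.domain, FACT.Debut))

# A-Box augmentation: one n-ary statement, e.g. extracted from a sentence
# saying a player made his Serie A debut with a given club in 1993.
# The statement is reified as a frame-instance node; URIs are illustrative.
stmt = URIRef("http://example.org/fact-extractor/statement/1")
g.add((stmt, RDF.type, FACT.Debut))
g.add((stmt, FACT.player, DBR.Francesco_Totti))
g.add((stmt, FACT.team, DBR.Associazione_Sportiva_Roma))
g.add((stmt, FACT.competition, DBR.Serie_A))
g.add((stmt, FACT.year, Literal("1993", datatype=XSD.gYear)))
g.add((stmt, FACT.confidence, Literal(0.87, datatype=XSD.float)))  # per-statement confidence score

print(g.serialize(format="turtle"))
```

Reifying each extracted frame as its own node is what allows a single extraction to carry more than two arguments (player, team, competition, year) plus a confidence score, which a plain binary subject-predicate-object triple could not express directly.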
