
Using microtasks to crowdsource DBpedia entity classification: A study in workflow design

Abstract

DBpedia is at the core of the Linked Open Data Cloud and widely used in research and applications. However, it is far from perfect. Its content suffers from many flaws, resulting from factual errors inherited from Wikipedia or incomplete mappings from Wikipedia infoboxes to the DBpedia ontology. In this work we focus on one class of such problems: untyped entities. We propose a hierarchical, tree-based approach to categorize DBpedia entities according to the DBpedia ontology using human computation and paid microtasks. We analyse the main dimensions of the crowdsourcing exercise in depth in order to derive suggestions for workflow design, and we study three different workflows with automatic and hybrid prediction mechanisms to select candidates for the most specific class from the DBpedia ontology. To test our approach, we run experiments on CrowdFlower using a gold-standard dataset of 120 previously unclassified entities. In our studies, human-computation-driven approaches generally achieved higher precision at lower cost than workflows with automatic predictors. However, each of the tested workflows has its merits, and none of them performs exceptionally well on the entities that the DBpedia Extraction Framework fails to classify. We discuss these findings and their potential implications for the design of effective crowdsourced entity classification in DBpedia and beyond.
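The hierarchical, tree-based idea described in the abstract can be sketched as a top-down traversal of the class tree, with one microtask per level: workers are shown the subclasses of the current class and pick the one that fits, narrowing down to the most specific class. This is a minimal illustrative sketch, not the paper's actual interface; the `ask_crowd` stub and the toy ontology fragment are assumptions standing in for real CrowdFlower tasks and the full DBpedia ontology.

```python
# Toy fragment of the DBpedia class hierarchy (illustrative subset only).
ONTOLOGY = {
    "Thing": ["Agent", "Place", "Work"],
    "Agent": ["Person", "Organisation"],
    "Person": ["Athlete", "Artist"],
}

def ask_crowd(entity, candidates):
    """Stand-in for a paid microtask: a worker picks the best-fitting
    subclass for `entity`, or None if none applies. Here the worker is
    simulated with a fixed answer key (hypothetical data)."""
    answers = {"Usain Bolt": {"Agent", "Person", "Athlete"}}
    for candidate in candidates:
        if candidate in answers.get(entity, set()):
            return candidate
    return None

def classify(entity, root="Thing"):
    """Descend the class tree, one crowd question per level; return the
    most specific class the crowd confirmed for the entity."""
    current = root
    while current in ONTOLOGY:
        choice = ask_crowd(entity, ONTOLOGY[current])
        if choice is None:
            break  # no subclass fits: stop at the current class
        current = choice
    return current

print(classify("Usain Bolt"))  # -> Athlete
```

In this scheme the number of questions per entity grows with the depth of the ontology rather than its total size, which is what makes a tree-based workflow attractive for microtask budgets; the paper's automatic and hybrid variants replace or pre-filter some of these crowd questions with predicted candidates.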
