Question Selection for Crowd Entity Resolution

Abstract

We study the problem of enhancing Entity Resolution (ER) with the help of crowdsourcing. ER is the problem of clustering records that refer to the same real-world entity and can be an extremely difficult process for computer algorithms alone. For example, figuring out which images refer to the same person can be a hard task for computers, but an easy one for humans. We study the problem of resolving records with crowdsourcing where we ask questions to humans in order to guide ER into producing accurate results. Since human work is costly, our goal is to ask as few questions as possible. We propose a probabilistic framework for ER that can be used to estimate how much ER accuracy we obtain by asking each question and select the best question with the highest expected accuracy. Computing the expected accuracy is #P-hard, so we propose approximation techniques for efficient computation. We evaluate our best question algorithms on real and synthetic datasets and demonstrate how we can obtain high ER accuracy while significantly reducing the number of questions asked to humans.
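The selection loop described in the abstract (estimate the accuracy gained by each candidate question, then ask the one with the highest expected accuracy) can be illustrated with a small Monte Carlo sketch. This is not the authors' algorithm or their approximation technique. It assumes, purely for illustration, that pairwise match probabilities are independent, that the ER result is the transitive closure of pairs with probability at least 0.5, and that accuracy is pairwise F1 against sampled "possible worlds". All function names below are hypothetical.

```python
import itertools
import random


def cluster_pairs(probs, n, threshold=0.5):
    """Assumed ER step: transitive closure over pairs with probability >= threshold.

    Returns the set of record pairs (a, b), a < b, that end up in the same cluster.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), p in probs.items():
        if p >= threshold:
            parent[find(a)] = find(b)
    return {(a, b) for a, b in itertools.combinations(range(n), 2)
            if find(a) == find(b)}


def pairwise_f1(predicted, truth):
    """Pairwise F1 between two sets of co-clustered pairs (assumed accuracy metric)."""
    if not predicted and not truth:
        return 1.0
    hits = len(predicted & truth)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


def sample_world(probs, n, rng):
    """Draw each pair independently (an assumption), then close transitively."""
    drawn = {pair: 1.0 if rng.random() < p else 0.0 for pair, p in probs.items()}
    return cluster_pairs(drawn, n)


def expected_accuracy(probs, n, rng, samples=200):
    """Monte Carlo estimate of the ER result's accuracy over sampled worlds."""
    predicted = cluster_pairs(probs, n)
    total = sum(pairwise_f1(predicted, sample_world(probs, n, rng))
                for _ in range(samples))
    return total / samples


def best_question(probs, n, samples=200, seed=0):
    """Pick the uncertain pair whose answer yields the highest expected accuracy."""
    rng = random.Random(seed)
    best_pair, best_score = None, -1.0
    for pair, p in sorted(probs.items()):
        if p in (0.0, 1.0):
            continue  # already resolved, nothing to ask
        yes = expected_accuracy({**probs, pair: 1.0}, n, rng, samples)
        no = expected_accuracy({**probs, pair: 0.0}, n, rng, samples)
        score = p * yes + (1 - p) * no  # weight each answer by its probability
        if score > best_score:
            best_pair, best_score = pair, score
    return best_pair, best_score
```

For example, `best_question({(0, 1): 0.9, (1, 2): 0.5, (0, 2): 0.2}, 3)` returns one of the three pairs together with its estimated score. Because the estimate is sampled, the chosen pair can vary with `seed` and `samples`. Sampling is used here only as a simple stand-in for the exact expected-accuracy computation, which the abstract states is #P-hard.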

Bibliographic record

Source: International Conference on Very Large Data Bases | 2013 | 12 pages
Authors: Steven Euijong Whang; Peter Lofgren; Hector Garcia-Molina
Original format: PDF
Chinese Library Classification: TP311.13
