Journal: IEEE Transactions on Knowledge and Data Engineering

Reducing Uncertainty of Schema Matching via Crowdsourcing with Accuracy Rates



Abstract

Schema matching is a central challenge for data integration systems. Inspired by the popularity and success of crowdsourcing platforms, we explore the use of crowdsourcing to reduce the uncertainty of schema matching. Since crowdsourcing platforms are most effective for simple questions, we assume that each Correspondence Correctness Question (CCQ) asks the crowd to decide whether a given correspondence should exist in the correct matching. Members of a crowd may, however, return incorrect answers with different probabilities. Accuracy rates can be attributes of CCQs as well as evaluations of individual workers. We prove that the uncertainty reduction equals the entropy of the answers minus the entropy of the crowd, and we show how to obtain lower and upper bounds for it. We propose frameworks and efficient algorithms that dynamically manage the CCQs to maximize the uncertainty reduction within a limited budget of questions. We develop two novel approaches, namely "Single CCQ" and "Multiple CCQ", which adaptively select, publish, and manage questions. We verify the value of our solutions through simulation and a real implementation.
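
The statement that the uncertainty reduction equals the entropy of the answers minus the entropy of the crowd can be illustrated with a small numerical sketch. The code below is not the paper's algorithm. It assumes a toy model in which the candidate matchings form a finite probability distribution, one worker answers one CCQ, and that worker is correct with a single fixed accuracy rate. Under those assumptions the expected reduction is the mutual information between the matching and the answer, which the code computes both by the closed form and directly from posteriors. All function names, the correspondence labels, and the greedy `pick_single_ccq` helper are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_reduction(matchings, correspondence, accuracy):
    """Closed form: H(answer) - H(accuracy).

    matchings: list of (set_of_correspondences, probability) pairs
    whose probabilities sum to 1 (an assumed toy representation).
    """
    # Prior probability that the correspondence is in the correct matching.
    p_c = sum(p for corrs, p in matchings if correspondence in corrs)
    # Probability that a worker with the given accuracy answers "yes".
    p_yes = accuracy * p_c + (1 - accuracy) * (1 - p_c)
    return entropy([p_yes, 1 - p_yes]) - entropy([accuracy, 1 - accuracy])


def expected_reduction_direct(matchings, correspondence, accuracy):
    """Same quantity from first principles: H(prior) - E[H(posterior)]."""
    prior = entropy([p for _, p in matchings])
    expected_posterior = 0.0
    for answer in (True, False):
        joint = []
        for corrs, p in matchings:
            truth = correspondence in corrs
            likelihood = accuracy if truth == answer else 1 - accuracy
            joint.append(p * likelihood)
        p_answer = sum(joint)
        if p_answer > 0:
            posterior = [j / p_answer for j in joint]
            expected_posterior += p_answer * entropy(posterior)
    return prior - expected_posterior


def pick_single_ccq(matchings, candidates, accuracy):
    """Hypothetical greedy rule: ask the question with the largest expected reduction."""
    return max(candidates, key=lambda c: expected_reduction(matchings, c, accuracy))


# Made-up example: three candidate matchings over four correspondences.
matchings = [
    ({"a1-b1", "a2-b2"}, 0.5),
    ({"a1-b1", "a2-b3"}, 0.3),
    ({"a1-b2", "a2-b3"}, 0.2),
]
candidates = ["a1-b1", "a1-b2", "a2-b2", "a2-b3"]

for c in candidates:
    print(c, round(expected_reduction(matchings, c, 0.9), 4))
print("ask:", pick_single_ccq(matchings, candidates, 0.9))
```

In this toy model a worker with accuracy 0.5 yields no expected reduction, and a perfectly accurate worker yields the full binary entropy of the correspondence's prior probability, which is why the most uncertain correspondences are the most valuable to ask about.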
