ACM SIGMOD International Conference on Management of Data

On Active Learning of Record Matching Packages

Abstract

We consider the problem of learning a record matching package (classifier) in an active learning setting. In active learning, the learning algorithm picks the set of examples to be labeled, unlike the more traditional passive learning setting, where a user selects the labeled examples. Active learning is important for record matching because manually identifying a suitable set of labeled examples is difficult. Previous algorithms that use active learning for record matching have serious limitations: the packages that they learn lack quality guarantees, and the algorithms do not scale to large input sizes. We present new algorithms for this problem that overcome these limitations. Our algorithms are fundamentally different from traditional active learning approaches and are designed from the ground up to exploit problem characteristics specific to record matching. We include a detailed experimental evaluation on real-world data demonstrating the effectiveness of our algorithms.
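
The difference between active and passive learning described above can be illustrated with a minimal sketch. This is not the algorithm from the paper, which the abstract describes as fundamentally different from traditional approaches; it is a generic textbook-style illustration. It assumes, purely for illustration, that the classifier is a single threshold on token-set Jaccard similarity and that labels are monotone in that similarity. The names `jaccard`, `learn_threshold`, `oracle`, and the sample record pairs are all hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def learn_threshold(pairs, oracle):
    """Actively learn a similarity threshold for matching record pairs.

    Illustrative assumption: every pair at or above some unknown
    similarity is a match, and every pair below it is a non-match.
    The learner chooses which pair to send to the labeler (the oracle)
    by binary search, so it requests O(log n) labels instead of n.
    Returns (threshold, number_of_labels_requested).
    """
    scored = sorted(pairs, key=lambda p: jaccard(*p))
    # Invariant: scored[:lo] are non-matches, scored[hi:] are matches.
    lo, hi = 0, len(scored)
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # one label requested from the human labeler
        if oracle(scored[mid]):
            hi = mid
        else:
            lo = mid + 1
    # lo is now the index of the lowest-similarity match, if any exists.
    threshold = jaccard(*scored[lo]) if lo < len(scored) else float("inf")
    return threshold, queries


# Hypothetical record pairs; the oracle stands in for a human labeler.
pairs = [
    ("John Smith", "John Smith"),
    ("John Smith", "Smith John"),
    ("Acme Corp Ltd", "Acme Corp"),
    ("Jane A Doe", "Jane Doe"),
    ("John Smith", "Jane Smith"),
    ("Acme Corp", "Acme Bank"),
    ("Acme Corp", "Apex Inc"),
    ("Jane Doe", "John Smith"),
]
true_matches = set(pairs[:4])


def oracle(pair):
    """Simulated labeler: returns True if the pair is a true match."""
    return pair in true_matches


threshold, queries = learn_threshold(pairs, oracle)
predicted = [p for p in pairs if jaccard(*p) >= threshold]
print(f"threshold={threshold:.3f}, labels requested={queries} of {len(pairs)}")
```

In a passive setting, a user would have to pick and label a training set up front. Here the learner decides which pairs are worth labeling, which is the property the abstract highlights. Real record matching classifiers are rarely this simple, and the monotonicity assumption will not hold exactly on real data.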