Why You Should Constrain Your Machine Learned Models

Abstract

A common use of machine learning is to gather whatever training examples one can, train a flexible model with some smoothness regularizers, test it on a held-out set of random examples, and *hope* it works well in practice. We will show that by adding constraints, we can better prepare our models for their futures and be more certain of their performance. Based on 8 years of experience at Google researching, designing, training, and launching hundreds of machine-learned models, I will discuss dozens of ways we found to constrain ML models so that they are more robust, fairer, safer, more accurate, easier to debug, and, when they do fail, fail more predictably and reasonably. This talk will focus on two classes of model constraints: shape constraints and rate constraints. The most common shape constraint is monotonicity, and it has long been known how to learn monotonic functions over one input using isotonic regression. We will discuss new R&D on six different, practically useful shape constraints and how to impose them on flexible, multi-layer models. The second class, rate constraints, refers to constraints on a classifier's output statistics, and is commonly used to make classifiers act responsibly for different groups. For example, we may constrain a classifier used globally to be at least 80% accurate on training examples from India or China, while also minimizing classification errors on average. We will point listeners to Google's open-source TensorFlow libraries for imposing these constraints, and to papers with more technical detail.
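
The abstract notes that monotonic functions of a single input have long been learnable with isotonic regression. As a rough illustration of that shape constraint (not code from the talk or from the TensorFlow libraries it references), the sketch below implements the classic pool-adjacent-violators algorithm in plain Python/NumPy; the function name isotonic_regression and the example data are placeholders chosen here.

```python
import numpy as np

def isotonic_regression(y, weights=None):
    """Pool Adjacent Violators: least-squares fit of a non-decreasing
    sequence to y. Returns one fitted (monotone) value per input point."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)

    # Each block stores (weighted mean, total weight, number of pooled points).
    means, wts, counts = [], [], []
    for yi, wi in zip(y, w):
        means.append(yi); wts.append(wi); counts.append(1)
        # Merge blocks while the non-decreasing constraint is violated.
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, c2 = means.pop(), wts.pop(), counts.pop()
            m1, w1, c1 = means.pop(), wts.pop(), counts.pop()
            wt = w1 + w2
            means.append((w1 * m1 + w2 * m2) / wt)
            wts.append(wt)
            counts.append(c1 + c2)

    # Expand pooled blocks back to one fitted value per original point.
    return np.concatenate([np.full(c, m) for m, c in zip(means, counts)])

# Example: noisy observations of a monotone trend.
x = np.arange(10)
y = x + np.random.default_rng(0).normal(scale=2.0, size=10)
print(isotonic_regression(y))  # non-decreasing fit to y
```

This post-hoc fit only handles one input; per the abstract, the talk is concerned with imposing monotonicity and other shape constraints directly on flexible, multi-layer models during training.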