In machine learning, tradeoffs must sometimes be made between accuracy and intelligibility: the most accurate models usually are not very intelligible, and the most intelligible models are usually less accurate. This limits the accuracy of models that can safely be deployed in mission-critical applications, where being able to understand, validate, edit, and ultimately trust a model is important. We have been working on a learning method that escapes this tradeoff: it is as accurate as full-complexity models such as boosted trees and random forests, yet more intelligible than linear models. This makes it easy to understand what the model has learned and to edit the model when it has learned inappropriate things. Making it possible for humans to understand and repair a model is critical because most training data has unexpected problems. I'll present several case studies where these high-accuracy GAMs discover surprising patterns in the data that would have made deploying a black-box model inappropriate. I'll also show how these models can be used to detect and correct bias. And if there's time, I'll briefly discuss using intelligible GAM models to predict COVID-19 mortality.
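To make the idea concrete, here is a minimal sketch of fitting a GAM by cyclic boosting of single-feature trees, which is the core idea behind the high-accuracy GAMs described above. This is an illustrative toy on synthetic data, not the speaker's actual implementation; all names, hyperparameters, and the data-generating function are assumptions made for the example.

```python
# Toy sketch: fit an additive model y ≈ mean + f_0(x_0) + f_1(x_1)
# by round-robin gradient boosting of shallow one-feature trees.
# Synthetic data and all settings here are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))
# Additive ground truth plus noise, so a GAM can recover it well.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.1, 500)

n_rounds, lr = 50, 0.1
pred = np.full(len(y), y.mean())
shape_trees = [[] for _ in range(X.shape[1])]  # per-feature ensembles

for _ in range(n_rounds):
    for j in range(X.shape[1]):  # round-robin over features
        resid = y - pred  # current residuals
        t = DecisionTreeRegressor(max_depth=2).fit(X[:, [j]], resid)
        pred += lr * t.predict(X[:, [j]])
        shape_trees[j].append(t)

def shape(j, xs):
    """Learned shape function f_j: sum of feature j's boosted trees."""
    xs = np.asarray(xs, dtype=float).reshape(-1, 1)
    return sum(lr * t.predict(xs) for t in shape_trees[j])
```

Because the model is a sum of one-dimensional shape functions, each `shape(j, xs)` can be plotted and inspected on its own, and an inappropriate pattern can be repaired by editing just that feature's function, which is what makes this model class editable in a way black-box models are not.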