Conference: IEEE International Conference on Data Engineering

Slice Finder: Automated Data Slicing for Model Validation


Abstract

As machine learning (ML) systems become democratized, it becomes increasingly important to help users easily debug their models. However, current data tools are still primitive when it comes to helping users trace model performance problems all the way to the data. We focus on the particular problem of slicing data to identify subsets of the validation data where the model performs poorly. This is an important problem in model validation because the overall model performance can fail to reflect that of the smaller subsets, and slicing allows users to analyze the model performance at a more granular level. Unlike general techniques (e.g., clustering) that can find arbitrary slices, our goal is to find interpretable slices (which are easier to act on than arbitrary subsets) that are large and problematic. We propose Slice Finder, an interactive framework for identifying such slices using statistical techniques. Applications include diagnosing model fairness and fraud detection, where identifying slices that are interpretable to humans is crucial.
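
To make the idea of "large and problematic" interpretable slices concrete, here is a minimal sketch in Python. It is not the Slice Finder implementation, and the abstract does not specify which statistical techniques the framework uses. The sketch assumes that each slice is a single `feature = value` predicate and that a slice is "problematic" when its mean loss exceeds that of the remaining data by a standardized effect size. The function name `find_slices`, the thresholds `min_size` and `min_effect`, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pvariance


def find_slices(rows, losses, features, min_size=2, min_effect=0.5):
    """Rank single-feature slices (feature = value) by effect size.

    rows: list of dicts, one per validation example.
    losses: per-example model loss, aligned with rows.
    min_size / min_effect: illustrative thresholds (assumptions).
    """
    # Group example indices by (feature, value) so that every slice
    # corresponds to a human-readable predicate.
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        for f in features:
            groups[(f, row[f])].append(i)

    results = []
    for (f, v), idx in groups.items():
        members = set(idx)
        inside = [losses[i] for i in idx]
        outside = [losses[i] for i in range(len(rows)) if i not in members]
        if len(inside) < min_size or not outside:
            continue  # too small to count as "large", or nothing to compare to
        diff = mean(inside) - mean(outside)
        # Standardize the loss gap by the averaged variance of both groups.
        pooled = sqrt((pvariance(inside) + pvariance(outside)) / 2)
        if pooled > 0:
            effect = diff / pooled
        else:
            effect = float("inf") if diff > 0 else 0.0
        if effect >= min_effect:
            results.append((f"{f} = {v}", len(inside), effect))
    # Largest effect first; ties broken by larger slice size.
    return sorted(results, key=lambda r: (-r[2], -r[1]))


# Hypothetical toy validation set: the model does badly on country = DE.
rows = [
    {"country": "US", "device": "mobile"},
    {"country": "US", "device": "desktop"},
    {"country": "DE", "device": "mobile"},
    {"country": "DE", "device": "desktop"},
    {"country": "US", "device": "mobile"},
    {"country": "DE", "device": "desktop"},
]
losses = [0.1, 0.2, 0.9, 1.1, 0.15, 1.0]

top = find_slices(rows, losses, ["country", "device"])
for name, size, effect in top:
    print(f"{name}: size={size}, effect={effect:.2f}")
```

On this toy data the `country = DE` slice ranks first, even though the average loss over all six examples would hide the gap. A fuller treatment would also test each candidate for statistical significance and search conjunctions of predicates; both are omitted here for brevity.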
