Sequencing, Combining and Sampling Classifiers to Help Find Needles in Haystacks


Abstract

Many binary prediction problems involve imbalanced datasets in which the ratio of minority-class to majority-class instances is very low. This is especially true of problems that use machine learning to better detect fraud, errors or exceptions. In this paper, we address the problem of extreme imbalance, i.e., where the imbalance ratio of majority over minority instances exceeds 500. Given the scarcity of minority examples, oversampling is not sensible due to its expensive computational cost. Hence, we explore and expand undersampling approaches. Specifically, we propose a modeling framework (i.e., a sequence of modeling steps) that seeks to leverage as much training data as possible. Our results indicate a better trade-off between false positives and false negatives, which makes the framework more suitable for real-life applications.
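To make the terms in the abstract concrete, the sketch below computes a majority-over-minority imbalance ratio and applies plain random undersampling of the majority class. It is a generic illustration only, not the framework proposed in the paper: the function names, the toy data and the target ratio of 10 are all assumptions chosen for the example.

```python
import random


def imbalance_ratio(labels):
    """Majority-over-minority count ratio for binary labels (1 = minority)."""
    minority = sum(1 for y in labels if y == 1)
    majority = len(labels) - minority
    if minority == 0:
        raise ValueError("no minority examples")
    return majority / minority


def random_undersample(rows, labels, target_ratio, seed=0):
    """Keep every minority row and a random subset of majority rows so that
    the majority/minority ratio is at most target_ratio (hypothetical helper)."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == 1]
    majority_idx = [i for i, y in enumerate(labels) if y != 1]
    n_keep = min(len(majority_idx), int(target_ratio * len(minority_idx)))
    kept = sorted(minority_idx + rng.sample(majority_idx, n_keep))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (assumed): 2 minority rows and 1,200 majority rows, so the
# ratio is 600, above the "extreme" threshold of 500 named in the abstract.
labels = [1, 1] + [0] * 1200
rows = list(range(len(labels)))
print(imbalance_ratio(labels))      # 600.0

sub_rows, sub_labels = random_undersample(rows, labels, target_ratio=10)
print(imbalance_ratio(sub_labels))  # 10.0
```

Plain random undersampling of this kind discards most majority rows (1,180 of 1,200 here), which illustrates why a framework that seeks to leverage as much training data as possible is of interest.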


Bibliographic record

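Source
European Conference on Artificial Intelligence; Conference on Prestigious Applications of Intelligent Systems | 2020 | pp. 769-1530 | 7 pages
Conference location
Authors
Jaebeen Lee; Lea A. Deleris
Author affiliations

Conference organizer
Original format: PDF
Language of text
Chinese Library Classification: TP18-53
Keywords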

Similar literature


1. FINDING A NEEDLE IN A HAYSTACK: NEW APPROACHES TO IDENTIFY DISEASE-CAUSING MUTATIONS IN PATIENTS' HIGH-THROUGHPUT SEQUENCING DATA [J]. Itan Yuval, Shang Lei, Zhang Shen-Ying. Journal of Clinical Immunology. 2016, Issue 3
2. Capturing needles in haystacks: a comparison of B-cell receptor sequencing methods [J]. Rachael JM Bashford-Rogers, Anne L Palser, Saad F Idris. BMC Immunology. 2014, Issue 1
3. Whole exome sequencing in short stature: Finding needles in the haystack [J]. Chernausek S.D. Hormone Research in Paediatrics. 2014, Issue 1
4. Sequencing, Combining and Sampling Classifiers to Help Find Needles in Haystacks [C]. Jaebeen Lee, Lea A. Deleris. European Conference on Artificial Intelligence; Conference on Prestigious Applications of Intelligent Systems. 2020
5. Searching for Needles in the Cosmic Haystack [D]. Devine, Thomas Ryan. 2020
6. Capturing needles in haystacks: a comparison of B-cell receptor sequencing methods [O]. Rachael JM Bashford-Rogers, Anne L Palser, Saad F Idris. 2014
7. Needles, Haystacks, and Next-Generation Genetic Sequencing [O]. Teneille R. Brown. 2018
