Identifying Rare Classes with Sparse Training Data

Abstract

Building models and learning patterns from a collection of data are essential tasks for decision making and the dissemination of knowledge. One common way to extract knowledge is to build a classifier. However, when the training dataset is sparse, it is difficult to build an accurate classifier. This is especially true in the biological sciences, as biological data are hard to produce and error-prone. Through empirical results, this paper shows the challenges of building an accurate classifier from a sparse biological training dataset. Our findings indicate inadequacies in well-known classification techniques. Although certain clustering techniques, such as seeded k-Means, show some promise, there is still room for improvement. In addition, we propose a novel idea that could be used to produce a more balanced classifier when training samples are very limited.
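
The abstract names seeded k-Means without describing it. As a rough illustration only, the sketch below shows the general idea as it is usually formulated: the initial centroids are computed from a few labelled seed points instead of being chosen at random, and ordinary k-means iterations follow. The function names, the toy data, and the "common"/"rare" labels are assumptions made for this example; this is not the authors' implementation or experimental setup.

```python
from math import dist  # Euclidean distance, Python 3.8+


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def seeded_kmeans(points, seeds, max_iter=100):
    """Cluster `points`, initialising one centroid per label from `seeds`.

    `seeds` maps a class label to a non-empty list of labelled points
    (hypothetical input format chosen for this sketch).
    Returns (centroids, assignments), where assignments[i] is the label
    of the centroid nearest to points[i].
    """
    # Initial centroids come from the labelled seeds, not random picks.
    centroids = {label: mean(pts) for label, pts in seeds.items()}
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(centroids, key=lambda label: dist(p, centroids[label]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so we have converged
        assignments = new_assignments
        # Update step: recompute each centroid from its current members.
        for label in centroids:
            members = [p for p, a in zip(points, assignments) if a == label]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[label] = mean(members)
    return centroids, assignments


# Toy data (invented): a larger "common" group and a small "rare" group.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (5.0, 5.0), (5.2, 4.9)]
seeds = {"common": [(0.0, 0.0)], "rare": [(5.0, 5.0)]}
centroids, labels = seeded_kmeans(points, seeds)
print(labels)
```

Because every cluster starts from at least one labelled example, a rare class gets its own centroid even when it has very few members, which is one plausible reason such methods look promising when training data are sparse.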

Bibliographic record

Source: International Conference on Database and Expert Systems Applications, 2007, 10 pages
Authors: Mingwu Zhang; Wei Jiang; Chris Clifton; Sunil Prabhakar
Original format: PDF
Chinese Library Classification: TP311.13
