Identifying Rare Classes with Sparse Training Data

Abstract

Building models and learning patterns from a collection of data are essential tasks for decision making and dissemination of knowledge. One of the common ways to extract knowledge is to build a classifier. However, when the training dataset is sparse, it is difficult to build an accurate classifier. This is especially true in biological science, as biological data are hard to produce and error-prone. Through empirical results, this paper shows the challenges in building an accurate classifier with a sparse biological training dataset. Our findings indicate the inadequacies of well-known classification techniques. Although certain clustering techniques, such as seeded k-Means, show some promise, there is still room for further improvement. In addition, we propose a novel idea that could be used to produce a more balanced classifier when training data samples are very limited.

Bibliographic details

Source: International Conference on Database and Expert Systems Applications (DEXA 2007), 3-7 September 2007, Regensburg (DE), 2007, pp. 751-760 (10 pages)
Conference location: Regensburg (DE)
Authors: Mingwu Zhang; Wei Jiang; Chris Clifton; Sunil Prabhakar
Affiliation: Department of Computer Science, Purdue University, West Lafayette, IN 47907-2107, USA
Format of original: PDF
Language of text: English
Chinese Library Classification: TP311.13
