The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter

Castelli V.; Cover T.M.


Abstract

We observe a training set $Q$ composed of $l$ labeled samples $\{(X_1,\theta_1),\ldots,(X_l,\theta_l)\}$ and $u$ unlabeled samples $\{X_1',\ldots,X_u'\}$. The labels $\theta_i$ are independent random variables satisfying $\Pr\{\theta_i=1\}=\eta$, $\Pr\{\theta_i=2\}=1-\eta$. The labeled observations $X_i$ are independently distributed with conditional density $f_{\theta_i}(\cdot)$ given $\theta_i$. Let $(X_0,\theta_0)$ be a new sample, independently distributed as the samples in the training set. We observe $X_0$ and we wish to infer the classification $\theta_0$. In this paper we first assume that the distributions $f_1(\cdot)$ and $f_2(\cdot)$ are given and that the mixing parameter is unknown. We show that the relative value of labeled and unlabeled samples in reducing the risk of optimal classifiers is the ratio of the Fisher informations they carry about the parameter $\eta$. We then assume that two densities $g_1(\cdot)$ and $g_2(\cdot)$ are given, but we do not know whether $g_1(\cdot)=f_1(\cdot)$ and $g_2(\cdot)=f_2(\cdot)$ or if the opposite holds, nor do we know $\eta$. Thus the learning problem consists of both estimating the optimum partition of the observation space and assigning the classifications to the decision regions. Here, we show that labeled samples are necessary to construct a classification rule and that they are exponentially more valuable than unlabeled samples.
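The Fisher-information ratio mentioned in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it uses the standard Fisher information expressions for this model, namely $1/(\eta(1-\eta))$ for one labeled sample and $\int (f_1(x)-f_2(x))^2 / (\eta f_1(x)+(1-\eta) f_2(x))\,dx$ for one unlabeled sample. The unit-variance Gaussian densities, the value $\eta = 0.3$, the integration range, and all function names are assumptions chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std=1.0):
    """Density of a normal distribution (assumed form of f_1 and f_2)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fisher_labeled(eta):
    """Fisher information about eta in one labeled sample (X, theta).

    With f_1 and f_2 known, only the label carries information about eta,
    so this is the Fisher information of a Bernoulli(eta) variable.
    """
    return 1.0 / (eta * (1.0 - eta))


def fisher_unlabeled(eta, f1, f2, lo=-20.0, hi=20.0, steps=40000):
    """Fisher information about eta in one unlabeled sample X.

    X has density eta*f1 + (1-eta)*f2; the integral of
    (f1 - f2)^2 / (eta*f1 + (1-eta)*f2) is approximated by the midpoint rule
    on [lo, hi] (an assumed range wide enough for the densities used here).
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        a, b = f1(x), f2(x)
        mix = eta * a + (1.0 - eta) * b
        if mix > 0.0:
            total += (a - b) ** 2 / mix * h
    return total


# Illustrative setting (assumed): two unit-variance Gaussians, eta = 0.3.
eta = 0.3
f1 = lambda x: gaussian_pdf(x, -1.0)
f2 = lambda x: gaussian_pdf(x, 1.0)

i_l = fisher_labeled(eta)
i_u = fisher_unlabeled(eta, f1, f2)
ratio = i_l / i_u  # how many unlabeled samples match one labeled sample

print(f"Fisher information, labeled sample:   {i_l:.4f}")
print(f"Fisher information, unlabeled sample: {i_u:.4f}")
print(f"Ratio I_l / I_u:                      {ratio:.4f}")
```

Under these assumptions the ratio is always at least 1, since an unlabeled sample is a labeled sample with the label discarded. It approaches 1 as the two densities separate, and it grows without bound as they become indistinguishable.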


Bibliographic details

Source: IEEE Transactions on Information Theory, 1996, No. 6, pp. 2102-2117 (16 pages)
Authors: Castelli V.; Cover T.M.
Original format: PDF
Language: English
CLC classification: Radio electronics and telecommunications technology

