Journal: Computer Speech and Language

Learning model order from labeled and unlabeled data for partially supervised classification, with application to word sense disambiguation

Abstract

Previous partially supervised classification methods can partition unlabeled data into positive and negative examples for a given class by learning from positive labeled examples and unlabeled examples, but they cannot further group the negative examples into meaningful clusters, even when the negative examples contain many different classes. Here we propose an automatic method that obtains a natural partitioning of mixed data (labeled plus unlabeled data) by maximizing a stability criterion, defined on the classification results of an extended label propagation algorithm, over all possible values of the model order (the number of classes) in the mixed data. Our experimental results on benchmark corpora for the word sense disambiguation task indicate that, when the labeled data is incomplete, this model order identification algorithm with the extended label propagation algorithm as the base classifier outperforms SVM, a one-class partially supervised classification algorithm, and the model order identification algorithm with semi-supervised k-means clustering as the base classifier.
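
For orientation, the standard (non-extended) form of label propagation, the kind of base classifier the abstract refers to, can be sketched as follows. This is a minimal illustration and not the authors' extended algorithm or their stability criterion, neither of which is specified on this page. The Gaussian affinity, the parameters `sigma` and `n_iter`, and the function name `label_propagation` are all assumptions made for the example.

```python
import math


def label_propagation(points, labels, n_classes, sigma=1.0, n_iter=200):
    """Standard label propagation (illustrative sketch, not the paper's extension).

    points: list of coordinate tuples.
    labels: list with a class index for labeled points and None for unlabeled ones.
    Returns the predicted class index for every point.
    """
    n = len(points)

    # Gaussian affinity between every pair of distinct points (assumed kernel).
    w = [
        [
            0.0 if i == j
            else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]

    # Row-normalise affinities into transition probabilities.
    t = [[wij / sum(row) for wij in row] for row in w]

    def clamp(i):
        # One-hot distribution for a labeled point.
        return [1.0 if c == labels[i] else 0.0 for c in range(n_classes)]

    # Labeled points start one-hot, unlabeled points start uniform.
    y = [
        clamp(i) if labels[i] is not None else [1.0 / n_classes] * n_classes
        for i in range(n)
    ]

    for _ in range(n_iter):
        # Each point takes the weighted average of its neighbours' distributions.
        y_new = [
            [sum(t[i][j] * y[j][c] for j in range(n)) for c in range(n_classes)]
            for i in range(n)
        ]
        # Labeled points are reset to their known labels after every step.
        y = [clamp(i) if labels[i] is not None else y_new[i] for i in range(n)]

    return [max(range(n_classes), key=lambda c: y[i][c]) for i in range(n)]


# Usage: two well-separated groups with one labeled point in each.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0), (6.0, 0.0)]
labels = [0, None, None, None, None, 1]
predicted = label_propagation(points, labels, n_classes=2)
```

In the setting the abstract describes, a classifier of this kind would be run for each candidate number of classes, and the candidate with the most stable classification results would be kept. That outer loop is omitted here because the stability criterion itself is not given on this page.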