Question classification based on co-training style semi-supervised learning

Abstract

In statistical question classification, semi-supervised learning, which can exploit abundant unlabeled samples, has received substantial attention in recent years. This paper proposes a novel question classification approach based on co-training style semi-supervised learning. The method extracts high-frequency keywords as classification features and uses word semantic similarity to adjust the feature weights. The classifiers are first trained on labeled data; the learned models are then refined with unlabeled data, where an unlabeled sample is assigned a label if the classifiers agree on it. Experiments on a Chinese question answering system in the tourism domain were conducted with different feature selections, different supervised and semi-supervised algorithms, different feature dimensions and different unlabeled rates. The experimental results show that the proposed method can effectively improve classification accuracy. Specifically, with 40% of the training set unlabeled, the average accuracy reaches 88.9% on coarse types and 78.2% on fine types, an improvement of around 2-4 percentage points.
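
The agreement-based labeling step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the classifiers, the feature views, or the stopping rule, so the two toy classifiers (a bag-of-words naive Bayes and a Jaccard nearest neighbor), the `co_train` helper, and the English tourism questions are hypothetical stand-ins, and the semantic-similarity weighting of features is not modeled.

```python
import math
from collections import Counter, defaultdict


def tokenize(question):
    """Whitespace tokenizer; a stand-in for real keyword extraction."""
    return question.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes over tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total)
            for w in doc:
                count = self.word_counts[label][w] + 1
                score += math.log(count / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class NearestNeighbor:
    """1-nearest-neighbor using Jaccard similarity of token sets."""

    def fit(self, docs, labels):
        self.examples = [(set(d), y) for d, y in zip(docs, labels)]
        return self

    def predict(self, doc):
        s = set(doc)

        def jaccard(t):
            union = s | t
            return len(s & t) / len(union) if union else 0.0

        return max(self.examples, key=lambda e: jaccard(e[0]))[1]


def co_train(labeled, unlabeled, rounds=5):
    """Train both classifiers on labeled data, then move an unlabeled
    question into the labeled pool whenever both classifiers agree."""
    docs = [tokenize(q) for q, _ in labeled]
    labels = [y for _, y in labeled]
    pool = [tokenize(q) for q in unlabeled]
    for _ in range(rounds):
        a = NaiveBayes().fit(docs, labels)
        b = NearestNeighbor().fit(docs, labels)
        remaining, added = [], False
        for doc in pool:
            label_a, label_b = a.predict(doc), b.predict(doc)
            if label_a == label_b:      # agreement: accept the label
                docs.append(doc)
                labels.append(label_a)
                added = True
            else:                       # disagreement: keep unlabeled
                remaining.append(doc)
        pool = remaining
        if not added or not pool:
            break
    # Refit on the enlarged labeled pool before returning.
    a = NaiveBayes().fit(docs, labels)
    b = NearestNeighbor().fit(docs, labels)
    return a, b, len(labels)


# Hypothetical toy data, not taken from the paper.
LABELED = [
    ("where is the stone forest", "location"),
    ("where is the old town located", "location"),
    ("how much is the ticket price", "price"),
    ("how much does the hotel cost", "price"),
]
UNLABELED = ["where is the hotel", "how much is the cable car ticket"]

nb, nn, n_labeled = co_train(LABELED, UNLABELED)
print(n_labeled, nb.predict(tokenize("where is the lake")))
```

Accepting a label only on agreement trades coverage for precision: questions on which the two classifiers disagree stay unlabeled rather than risk adding a wrong label to the training pool.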

Bibliographic record

  • Source
    Pattern Recognition Letters | 2010, Issue 13 | pp. 1975-1980 | 6 pages
  • Author affiliations

    The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China; The Institute of Intelligent Information Processing, Computer Technology Application, Key Laboratory of Yunnan Province, Kunming 650051, China;

    Department of Software, Yunnan University, Kunming 650091, China;

    The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China;

    The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China;

    The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China;

    The School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China; The Institute of Intelligent Information Processing, Computer Technology Application, Key Laboratory of Yunnan Province, Kunming 650051, China;

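  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords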

    Chinese question classification; word semantic similarity; semi-supervised learning; co-training
