Published in: 《苏州科技学院学报(自然科学版)》 (Journal of Suzhou University of Science and Technology, Natural Science Edition)

Chinese Question Classification Incorporating Category Clue Words

Abstract

Chinese questions carry little information that is useful for classification. To address this, the paper proposes an automatic feature combination and binding algorithm that incorporates a category clue word set (CCWs). On top of bag-of-words, part-of-speech, and named-entity features, the method takes the head word, the subject, the interrogative word, and the components related to the interrogative word as the category clue word set, because these describe the intent of a question more precisely. Experiments show that the new features formed by combining and binding CCWs with the basic features effectively improve classifier performance on a small, imbalanced corpus. The combination and binding method improves the accuracy of an SVM classifier to some extent, reaching 86.77% on fine-grained classes and 94.08% on coarse-grained classes.
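The abstract does not spell out how clue words are combined and bound with the basic features, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. The function name `build_features`, the feature string formats, the role names `head` and `wh`, and the example question with its tags are all assumptions made for illustration; in practice the tokens, part-of-speech tags, entities, and clue words would come from a segmenter and parser.

```python
from itertools import combinations


def build_features(tokens, pos_tags, entities, clue_words):
    """Return a sorted list of string features for one question.

    tokens, pos_tags: parallel lists (assumed to come from a segmenter/tagger).
    entities: named-entity labels found in the question.
    clue_words: dict mapping a clue role (e.g. "head", "wh") to a word.
    All names and formats here are hypothetical, not taken from the paper.
    """
    features = set()

    # Basic features: bag of words, part of speech, named entities.
    features.update(f"W={w}" for w in tokens)
    features.update(f"POS={p}" for p in pos_tags)
    features.update(f"NE={e}" for e in entities)

    # One feature per clue word, keyed by its role.
    clue_feats = sorted(f"CCW_{role}={word}" for role, word in clue_words.items())
    features.update(clue_feats)

    # Combination (assumed form): pair clue-word features with each other.
    for a, b in combinations(clue_feats, 2):
        features.add(f"{a}&{b}")

    # Binding (assumed form): attach each clue word to each entity label.
    for c in clue_feats:
        for e in entities:
            features.add(f"{c}|NE={e}")

    return sorted(features)


if __name__ == "__main__":
    # Hypothetical example: "中国的首都在哪里" (Where is the capital of China?)
    feats = build_features(
        tokens=["中国", "的", "首都", "在", "哪里"],
        pos_tags=["ns", "u", "n", "p", "r"],
        entities=["LOC"],
        clue_words={"head": "首都", "wh": "哪里"},
    )
    for f in feats:
        print(f)
```

The resulting feature strings could then be turned into sparse vectors and passed to an SVM classifier, which is the classifier the abstract reports results for. Keeping the combined and bound features as plain strings makes it easy to see which clue word contributed to which feature.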
