
A Supervised Clustering Method for Text Classification



Abstract

This paper describes a supervised three-tier clustering method for classifying students' qualitative physics essays in the Why2-Atlas tutoring system. Our main purpose in categorizing text in the tutoring system is to map students' essay statements onto physics principles and misconceptions. A simple 'bag-of-words' representation with a naive Bayes classifier proved unsatisfactory for our analysis: it produced many misclassifications because the concepts themselves are closely related and because it could not handle misconceptions. Hence, we investigate the performance of the k-nearest neighbor algorithm coupled with clusters of physics concepts in classifying students' essays. We use a three-tier tagging schema (cluster, sub-cluster, and class) for each document and find that this kind of supervised hierarchical clustering leads to a better understanding of the student's essay.
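
As a rough illustration of the idea in the abstract, the sketch below pairs a bag-of-words k-nearest neighbor vote with three-tier tags. It is not the authors' implementation. The training sentences, the tag names, the cosine similarity measure, and the tier-by-tier narrowing of the neighbor set are all assumptions made for this example; the abstract does not say how the three tiers are combined.

```python
import math
from collections import Counter

# Hypothetical training data: (essay statement, (cluster, sub_cluster, class)).
# The sentences and tag names are invented for illustration only.
TRAINING = [
    ("the force of gravity pulls the ball down", ("forces", "gravity", "principle")),
    ("gravity is the only force acting on the ball", ("forces", "gravity", "principle")),
    ("heavier objects fall faster than lighter objects", ("forces", "gravity", "misconception")),
    ("the car keeps moving at constant velocity with no net force", ("motion", "inertia", "principle")),
    ("an object needs a force to keep moving", ("motion", "inertia", "misconception")),
]

def bag_of_words(text):
    """Count lowercased, whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def majority(labels):
    """Most frequent label among the given labels."""
    return Counter(labels).most_common(1)[0][0]

def classify(text, training, k=3):
    """Return a (cluster, sub_cluster, class) tuple for one statement.

    Assumed scheme: take the k most similar training statements, vote on
    the cluster, keep only neighbors in the winning cluster, then repeat
    for the sub-cluster and finally the class.
    """
    query = bag_of_words(text)
    ranked = sorted(
        training,
        key=lambda example: cosine(query, bag_of_words(example[0])),
        reverse=True,
    )
    neighbors = [tags for _, tags in ranked[:k]]
    chosen = []
    for tier in range(3):
        label = majority(tags[tier] for tags in neighbors)
        chosen.append(label)
        neighbors = [tags for tags in neighbors if tags[tier] == label]
    return tuple(chosen)

if __name__ == "__main__":
    print(classify("gravity pulls the ball down", TRAINING, k=3))
```

Voting on the coarse cluster first means a near-miss from a related but different concept can be discarded before the finer tiers are decided, which is one plausible way hierarchical tags could reduce confusions between related concepts.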
