Classifying unstructured textual data using the Product Score Model: an alternative text mining algorithm

Abstract

Unstructured textual data such as students' essays and life narratives can provide helpful information in educational and psychological measurement, but they often contain irregularities and ambiguities, which create difficulties in analysis. Text mining techniques, which seek to extract useful information from textual data sources by identifying interesting patterns, are promising. This chapter describes the general procedures of text classification using text mining and presents an alternative machine learning algorithm for text classification, named the product score model (PSM). Using the bag-of-words representation (single words), we conducted a comparative study between PSM and two commonly used classification models, decision tree and naïve Bayes. An application of these three models is illustrated for real textual data. The results showed that the PSM performed the most efficiently and stably in classifying text. Implications of these results for the PSM are further discussed and recommendations about its use are given.

Bibliographic details

  • Authors: He Q.; Veldkamp B.P.
  • Year: 2012
  • Original format: PDF
  • Language: English
