IEEE International Conference on Data Mining Workshops

Semi-Supervised Psychometric Scoring of Document Collections



Abstract

We describe a generic computational approach for developing psychometric profiling methods. The approach is based on semi-supervised analysis of document collections using topic modeling. It relies on a supervisor who provides a set of seed documents grouped by abstract themes, such as Schwartz values or personality traits, and optionally a separate background corpus. Instead of casting the problem as standard classification, we interpret the group labels as a guide for finding distinguishing features. During training, we fit a separate model to each group of documents associated with a theme, using nonnegative matrix factorization to obtain theme-specific topic distributions. During analysis, we decompose a new document using the learned model to arrive at theme scores. We demonstrate the approach on two psychometric profiling theories (Schwartz and Big Five). We evaluate the Schwartz scores with leave-one-out cross-validation and compare the Big Five scores against independent surveys, which are much more costly to carry out.
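The two stages named in the abstract (per-theme factorization, then decomposition of a new document) can be illustrated with a minimal sketch. This is not the authors' implementation: the multiplicative-update solver, the row normalization of topics, the function names (`nmf`, `fit_theme_topics`, `score_document`), the theme labels, and the toy term counts are all assumptions made for illustration, and the optional background corpus is left out.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W @ H using multiplicative updates
    for the Frobenius loss. The solver choice is an assumption."""
    rng = np.random.default_rng(seed)
    n_docs, n_terms = V.shape
    W = rng.random((n_docs, k)) + eps
    H = rng.random((k, n_terms)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fit_theme_topics(seed_docs_by_theme, k=2, eps=1e-9):
    """Factorize each theme's seed documents separately, then stack the
    resulting topics and remember which theme each topic came from."""
    topics, labels = [], []
    for theme, V in seed_docs_by_theme.items():
        _, H = nmf(V, k)
        # Normalize each topic to unit sum so themes are comparable
        # (a hypothetical choice, not taken from the abstract).
        topics.append(H / (H.sum(axis=1, keepdims=True) + eps))
        labels.extend([theme] * k)
    return np.vstack(topics), labels

def score_document(x, topics, labels, n_iter=200, eps=1e-9):
    """Decompose term vector x over the fixed topics with nonnegative
    weights, then sum the weights per theme and normalize to 1."""
    w = np.ones(topics.shape[0])
    for _ in range(n_iter):
        w *= (topics @ x) / (topics @ topics.T @ w + eps)
    scores = {}
    for weight, theme in zip(w, labels):
        scores[theme] = scores.get(theme, 0.0) + float(weight)
    total = sum(scores.values()) or 1.0
    return {theme: s / total for theme, s in scores.items()}

# Toy term-count matrices over a 6-term vocabulary (made-up data):
# theme_a seed documents use terms 0-2, theme_b seed documents use terms 3-5.
seed_docs = {
    "theme_a": np.array([[3, 2, 1, 0, 0, 0],
                         [2, 3, 1, 0, 0, 0],
                         [1, 1, 3, 0, 0, 0]], dtype=float),
    "theme_b": np.array([[0, 0, 0, 3, 2, 1],
                         [0, 0, 0, 1, 3, 2],
                         [0, 0, 0, 2, 1, 3]], dtype=float),
}
topics, labels = fit_theme_topics(seed_docs, k=2)

# A new document that only uses theme_a vocabulary.
new_doc = np.array([3, 2, 1, 0, 0, 0], dtype=float)
scores = score_document(new_doc, topics, labels)
print(scores)
```

In this sketch the group labels never act as classification targets. They only determine which seed documents are factorized together, and a theme score is the share of a new document's reconstruction weight that falls on that theme's topics.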
