Conference: Workshop on Innovative Use of NLP for Building Educational Applications

Improving Native Language Identification with TF-IDF Weighting


Abstract

This paper presents a Native Language Identification (NLI) system based on TF-IDF weighting schemes and linear classifiers: support vector machines, logistic regression, and perceptrons. The system participated in the closed-training track of the 2013 NLI Shared Task, achieving an overall accuracy of 0.814 on a set of 11 native languages, only 2.2 percentage points below the winning system. In subsequent evaluations using 10-fold cross-validation (as given by the organizers) on the combined training and development data, the best average accuracy was 0.8455, obtained with TF-IDF features over combined word unigrams and bigrams.
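As a rough illustration of the kind of pipeline the abstract describes, the sketch below builds TF-IDF vectors over combined word unigrams and bigrams and trains a multiclass perceptron, one of the three linear classifier families named above. It is not the authors' implementation: the smoothed IDF formula, the L2 normalisation, the whitespace tokenisation, the function names, and the toy essays and labels are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text):
    """Return word unigrams followed by word bigrams for one document."""
    tokens = text.lower().split()  # naive whitespace tokenisation (assumption)
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every n-gram."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(word_ngrams(doc)))
    # Smoothed IDF variant chosen for illustration only (assumption).
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """Map a document to an L2-normalised sparse TF-IDF vector."""
    counts = Counter(t for t in word_ngrams(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def predict(weights, vec):
    """Return the label whose weight vector gives the highest dot product."""
    def score(label):
        w = weights[label]
        return sum(w[t] * v for t, v in vec.items() if t in w)
    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(weights), key=score)


def train_perceptron(vectors, labels, epochs=10):
    """Train a multiclass perceptron with one weight vector per label."""
    weights = {label: defaultdict(float) for label in set(labels)}
    for _ in range(epochs):
        for vec, gold in zip(vectors, labels):
            guess = predict(weights, vec)
            if guess != gold:
                # Standard perceptron update: reward gold, penalise the wrong guess.
                for term, value in vec.items():
                    weights[gold][term] += value
                    weights[guess][term] -= value
    return weights


# Toy, made-up essays and native-language labels, purely for illustration.
train_docs = [
    "i am agree with this statement",
    "i am agree that people travel",
    "in my opinion the informations are useful",
    "the informations in my opinion help",
]
train_labels = ["SPA", "SPA", "FRE", "FRE"]

idf = fit_idf(train_docs)
vectors = [tfidf_vector(d, idf) for d in train_docs]
model = train_perceptron(vectors, train_labels)
prediction = predict(model, tfidf_vector("i am agree with you", idf))
```

A real system would train on the shared task's learner essays and would likely replace the hand-rolled perceptron with a library implementation of the SVM or logistic regression classifiers that the abstract also names.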
