Improving Native Language Identification with TF-IDF Weighting

Abstract

This paper presents a Native Language Identification (NLI) system based on TF-IDF weighting schemes and linear classifiers: support vector machines, logistic regression and perceptrons. The system took part in the closed-training track of the 2013 NLI Shared Task, achieving an overall accuracy of 0.814 on a set of 11 native languages, only 2.2 percentage points below the winner's performance. In subsequent evaluations using 10-fold cross-validation (as given by the organizers) on the combined training and development data, the best average accuracy was 0.8455, obtained with TF-IDF features over combined word unigrams and bigrams.
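
The feature set named in the abstract (TF-IDF over combined word unigrams and bigrams, fed to a linear classifier) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the smoothed IDF formula, the L2 normalisation, the plain multiclass perceptron, all function names, and the four toy sentences with labels `L1_A` and `L1_B` are assumptions made for illustration, and the abstract does not specify which TF-IDF variant the system used.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text):
    """Return lowercase word unigrams plus bigrams for one document."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return words + bigrams


def fit_idf(docs):
    """Smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(word_ngrams(doc)))
    return {term: math.log((1 + n_docs) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(term for term in word_ngrams(doc) if term in idf)
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {term: v / norm for term, v in vec.items()}


def predict(weights, labels, vec):
    """Pick the label whose weight vector scores highest on vec."""
    return max(labels,
               key=lambda lab: sum(weights[lab][t] * v for t, v in vec.items()))


def train_perceptron(vectors, targets, epochs=10):
    """Multiclass perceptron: on a mistake, move weights toward the true label."""
    labels = sorted(set(targets))
    weights = {lab: defaultdict(float) for lab in labels}
    for _ in range(epochs):
        for vec, gold in zip(vectors, targets):
            guess = predict(weights, labels, vec)
            if guess != gold:
                for term, v in vec.items():
                    weights[gold][term] += v
                    weights[guess][term] -= v
    return weights, labels


# Toy, invented sentences and labels; NOT data from the shared task.
train_docs = [
    "i am agree with this opinion",
    "i am agree that the travel is good",
    "he has many informations about the city",
    "she gave me many informations yesterday",
]
train_labels = ["L1_A", "L1_A", "L1_B", "L1_B"]

idf = fit_idf(train_docs)
weights, labels = train_perceptron([tfidf(d, idf) for d in train_docs],
                                   train_labels)
prediction = predict(weights, labels, tfidf("we am agree with the plan", idf))
```

Bigrams such as "am agree" carry phrase-level evidence that single words miss, which is one plausible reading of why the combined unigram and bigram features performed best. A realistic experiment would replace the toy data with the shared-task essays and evaluate with the organizers' 10-fold split.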

Bibliographic information

Source: Workshop on Innovative Use of NLP for Building Educational Applications, 2013, 8 pages
Authors: Binyam Gebrekidan Gebre; Marcos Zampieri; Peter Wittenburg; Tom Heskes
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar literature

1. Enhanced Tf-Idf Weighting Scheme For Plagiarism Detection Model For Arabic Language [J]. Ameer A.A. Yousef, Mohd Juzaiddin Ab Aziz. Australian Journal of Basic and Applied Sciences. 2015, issue 2015
2. Text Classification Using Novel Term Weighting Scheme-Based Improved TF-IDF for Internet Media Reports [J]. Zhiying Jiang, Bo Gao, Yanlin He. Mathematical Problems in Engineering: Theory, Methods and Applications. 2021, issue a
3. A New Approach that improves TF-IDF Weighting Measure [J]. Reddahi Nabil, Labriji Amine, Abdelbaki Issam. International Journal of Information and Communication Technology Research. 2015, issue 10
4. Improving Native Language Identification with TF-IDF Weighting [C]. Binyam Gebrekidan Gebre, Marcos Zampieri, Peter Wittenburg. Workshop on Innovative Use of NLP for Building Educational Applications. 2013
5. Native Language Identification Using Phonetic Algorithms [D]. Smiley, Charese H. 2018
6. Lexical Exposure to Native Language Dialects can Improve Non-native Phonetic Discrimination [O]. Annie J. Olmstead, Navin Viswanathan
7. Text Classification Using Novel Term Weighting Scheme-Based Improved TF-IDF for Internet Media Reports [O]. Zhiying Jiang, Bo Gao, Yanlin He. 2021
8. Improving Academic Performance Among American Indian, Alaska Native, and Native Hawaiian Students: Assessment and Identification of Learning and Learning Disabilities: Workshop Summary. Held in Santa Fe, New Mexico on March 16-18, 2005 [R]. 2005
