

What is your Mother Tongue?: Improving Chinese native language identification by cleaning noisy data and adopting BM25


Abstract

Native language identification (NLI) is the task of identifying an author's native language from essays written in the author's second language. In this work, we build a supervised NLI model on a Chinese learner corpus. In the NLI field, this is the first work to (1) eliminate noisy data automatically before the training phase and (2) employ a BM25 term weighting technique to score each feature. We also adopt a hierarchical structure of linear support vector machine classifiers and achieve a state-of-the-art accuracy of 77.1%, which exceeds that of other Chinese NLI methods by over 10%.

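The abstract does not say which BM25 variant or parameters the authors used, so the following is only a minimal sketch of BM25 term weighting as it is commonly defined. The function name `bm25_weights`, the defaults `k1=1.2` and `b=0.75`, the non-negative (Lucene-style) IDF, and the toy essays are all assumptions for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def bm25_weights(docs, k1=1.2, b=0.75):
    """Return one {term: BM25 weight} dict per tokenized document.

    Assumes a non-empty corpus of non-empty token lists.
    k1 and b are common default values, not values from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    weights = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are damped.
        length_norm = k1 * (1 - b + b * len(d) / avg_len)
        doc_weights = {}
        for term, f in tf.items():
            # Non-negative IDF variant (assumed, not stated in the source).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            doc_weights[term] = idf * f * (k1 + 1) / (f + length_norm)
        weights.append(doc_weights)
    return weights


# Hypothetical toy essays, already split into tokens.
essays = [
    ["我", "是", "学生"],
    ["我", "学习", "中文", "中文"],
    ["他", "是", "老师"],
]
vectors = bm25_weights(essays)
```

In a pipeline like the one the abstract describes, each essay's weight dictionary could serve as a sparse feature vector for a linear classifier. Terms that are frequent within an essay but rare across the corpus receive the largest weights.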

Bibliographic details

Source
2016 IEEE International Conference on Big Data Analysis | 2016 | pp. 1-6 | 6 pages
Conference location: Hangzhou (CN)
Authors
Lan Wang; Masahiro Tanaka; Hayato Yamana
Author affiliation
Waseda University, Tokyo, Japan
Original format: PDF
Text language: English
Keywords
author profiling; machine learning; text mining

Similar literature

1. Native American Languages As Heritage Mother Tongues [J]. Teresa L. McCarty. Language, Culture and Curriculum. 2008, No. 3
2. Relevance of Mother Tongue National Examinations to Mother Tongue Curriculum (Syllabus) and Textbooks: The Case of Wolaita Language in Wolaita Zone, Ethiopia [J]. Markos Mathewos Alaro, Abraham Kebede. Advances in Sciences and Humanities. 2020, No. 1
3. The Problematic Interaction between the Mother Tongues, the National Language and Foreign Language Instruction in Turkish Education [J]. Davut Peaci (William S. Peachy). Procedia - Social and Behavioral Sciences. 2016, No. 1
4. What is your Mother Tongue?: Improving Chinese native language identification by cleaning noisy data and adopting BM25 [C]. Lan Wang, Masahiro Tanaka, Hayato Yamana. IEEE International Conference on Big Data Analysis. 2016
5. Mother tongue/father tongue: Gender-linked differences in language use and their influence on the perceived authority of the preacher [D]. Ziel, Catherine Agnes. 1991
6. The Gender Gap in Second Language Acquisition: Gender Differences in the Acquisition of Dutch among Immigrants from 88 Countries with 49 Mother Tongues [O]. Frans W. P. van der Slik, Roeland W. N. M. van Hout, Job J. Schepens
7. Improving Chinese Native Language Identification by Cleaning Noisy Data and Adopting BM25 [O]. Wang Lan. 2016
