Vanilla Classifiers for Distinguishing between Similar Languages


Abstract

In this paper we describe the submission of the UniBuc-NLP team for the Discriminating between Similar Languages Shared Task, DSL 2016. We present and analyze the results we obtained in the closed track of sub-task 1 (Similar languages and language varieties) and sub-task 2 (Arabic dialects). For sub-task 1 we used a logistic regression classifier with tf-idf feature weighting and for sub-task 2 a character-based string kernel with an SVM classifier. Our results show that good accuracy scores can be obtained with limited feature and model engineering. While certain limitations are to be acknowledged, our approach worked surprisingly well for out-of-domain, social media data, with 0.898 accuracy (3rd place) for dataset B1 and 0.838 accuracy (4th place) for dataset B2.
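
The two feature representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram length, the idf formula, or which string kernel was used, so the trigram default, the smoothed idf, and the choice of a character n-gram spectrum kernel are all assumptions made for illustration. The function names are hypothetical, and the classifiers themselves (logistic regression, SVM) are left out.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count all overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def tfidf_vectors(docs, n=3):
    """Return one sparse tf-idf vector (dict) per document.

    Assumption: smoothed idf, log((1 + N) / (1 + df)) + 1, applied to
    raw n-gram counts. The abstract does not give the exact weighting.
    """
    counts = [char_ngrams(d, n) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for c in counts for g in c)
    n_docs = len(docs)
    idf = {g: math.log((1 + n_docs) / (1 + f)) + 1.0 for g, f in df.items()}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def spectrum_kernel(s, t, n=3):
    """Character n-gram spectrum kernel: dot product of n-gram counts."""
    cs, ct = char_ngrams(s, n), char_ngrams(t, n)
    # Counter returns 0 for n-grams that do not occur in t.
    return sum(v * ct[g] for g, v in cs.items())


def normalized_kernel(s, t, n=3):
    """Spectrum kernel scaled to [0, 1] so text length matters less."""
    denom = math.sqrt(spectrum_kernel(s, s, n) * spectrum_kernel(t, t, n))
    return spectrum_kernel(s, t, n) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy sentences, invented for illustration only.
    docs = ["o menino joga futebol", "el chico juega al futbol"]
    vectors = tfidf_vectors(docs)
    print(len(vectors[0]), "weighted trigrams in the first document")
    print(normalized_kernel(docs[0], docs[1]))
```

In practice the tf-idf vectors would be fed to a logistic regression classifier, and a matrix of pairwise kernel values would be passed to an SVM that accepts a precomputed kernel (for example, scikit-learn's `SVC(kernel="precomputed")`).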


Bibliographic details

Source: Workshop on NLP for similar languages, varieties and dialects, 2016, pp. 235-242 (8 pages)
Authors: Alina Maria Ciobanu; Sergiu Nisioi; Liviu P. Dinu
Original format: PDF

