Conference: International Conference on Recent Advances in Natural Language Processing

Identifying the Authors' National Variety of English in Social Media Texts


Abstract

In this paper, we present a study on identifying authors' national variety of English in social media texts. In data from Facebook and Twitter, information about each author's social profile is annotated, and the national English variety (US, UK, AUS, CAN, NNS) that each author uses is attributed. We tested four feature types: formal linguistic features, POS features, lexicon-based features related to the different varieties, and data-based features from each English variety. We used various machine learning algorithms for the classification experiments, and we implemented a feature selection process. With the 31 highest-ranked features, classification accuracy reached up to 77.32%. The experimental results are evaluated, and the efficacy of the ranked features is discussed.
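
To make the idea of a lexicon-based feature concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not list the lexicons used, so `VARIETY_LEXICON`, its entries, and the function name `lexicon_features` are all hypothetical. The sketch only covers US and UK spelling variants and computes, for each variety, the share of tokens in a text that appear in that variety's word list.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon of well-known spelling variants.
# The paper's actual lexicons are not given in the abstract.
VARIETY_LEXICON = {
    "US": {"color", "center", "favorite", "organize"},
    "UK": {"colour", "centre", "favourite", "organise"},
}


def lexicon_features(text):
    """Return, for each variety, the share of tokens found in its lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {
        variety: sum(counts[word] for word in words) / total
        for variety, words in VARIETY_LEXICON.items()
    }


# Example: three of the eight tokens are UK spellings.
features = lexicon_features("My favourite colour is in the city centre")
```

Scores like these could serve as one column each in a feature matrix, alongside the other feature types the abstract names, before any feature ranking or classification is applied.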
