Journal: Open Journal of Social Sciences

A Stylometric Investigation of Linguistic Styles Based on a Vietnamese Corpus

Abstract

The role of stylometric methods in linguistics has received increased attention across a number of disciplines in recent years, particularly in forensic linguistics. This study assesses the value of correspondence analysis, a stylometric method, in Vietnamese text analysis. The dataset is extracted from VVC (VnExpress Viewpoint Corpus), a 1.3-million-token corpus of Vietnamese opinion articles. We examine seven part-of-speech features to identify relational features that characterize authorial style. The analysis focuses on feature effects, with the aim of shedding light on whether the linguistic features of writing style are consistent across authors' genders and professions. Taken together, the seven features produce encouraging results on what is acknowledged to be a difficult problem for the Vietnamese language. In addition, when correspondence analysis is applied to the seven linguistic features in the dataset grouped by authors' gender, conjunctions and verbs perform best. When the dataset is grouped by authors' profession, conjunctions and pronouns offer a striking improvement in stylometric investigation. The discriminating ability was particularly impressive, suggesting that, in a collective sense, part-of-speech features provide a good set of markers.
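To make the idea of "feature effects" concrete, the sketch below computes how much each part-of-speech column contributes to the total inertia (chi-square divided by the grand total) of an author-group by feature count table. Total inertia is the quantity that correspondence analysis decomposes. This is an illustrative sketch only, not the authors' procedure: the function name `column_inertia`, the group labels, and all counts are hypothetical and are not taken from VVC or from the paper.

```python
from typing import Dict, List


def column_inertia(table: List[List[float]], labels: List[str]) -> Dict[str, float]:
    """Return each column's contribution to total inertia (chi-square / n).

    `table` holds raw counts: one row per author group, one column per feature.
    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(len(labels))]

    contributions: Dict[str, float] = {}
    for j, label in enumerate(labels):
        total = 0.0
        for i, row in enumerate(table):
            expected = row_mass[i] * col_mass[j]  # proportion under independence
            observed = row[j] / n                 # observed proportion
            total += (observed - expected) ** 2 / expected
        contributions[label] = total
    return contributions


# Hypothetical counts, invented for illustration (NOT data from the paper).
pos_labels = ["conjunction", "verb", "pronoun"]
counts = [
    [120, 900, 310],  # hypothetical author group A
    [180, 820, 300],  # hypothetical author group B
]

shares = column_inertia(counts, pos_labels)
total_inertia = sum(shares.values())
for label, value in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {value / total_inertia:.1%} of total inertia")
```

A column with a larger share departs more from what independence between group and feature would predict, which is one informal way to read a claim that a feature "performs best" at separating groups. A full correspondence analysis would go further and apply a singular value decomposition to the standardized residuals to obtain coordinates for rows and columns.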
