Journal: Digital Scholarship in the Humanities

A small set of stylometric features differentiates Latin prose and verse



Abstract

Identifying the stylistic signatures characteristic of different genres is of central importance to literary theory and criticism. In this article, we report a large-scale computational analysis of Latin prose and verse using a combination of quantitative stylistics and supervised machine learning. We train a set of classifiers to differentiate prose and poetry with high accuracy (>97%) based on a set of twenty-six text-based, primarily syntactic features, and we rank the relative importance of these features to identify a low-dimensional set still sufficient to achieve excellent classifier performance. This analysis demonstrates that Latin prose and verse can be classified effectively using just three top features. From examination of the highly ranked features, we observe that measures of the hypotactic style favored in Latin prose (i.e. subordinating constructions in complex sentences, such as relative clauses) are especially useful for classification.
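To make the idea of a "measure of hypotactic style" concrete, the following is a minimal sketch of one surface-level feature: the mean number of relative-pronoun forms per sentence. It is an illustration only, not the authors' feature set or pipeline. The marker list `RELATIVE_FORMS`, the function names, and the punctuation-based sentence splitting are all assumptions made for this example; several of the listed forms (such as "quam", "quod", "quo") can also be conjunctions or adverbs, so the count is a crude proxy.

```python
import re

# Hypothetical marker list (an assumption for this sketch): surface forms of
# the relative pronoun qui/quae/quod. Some forms are ambiguous with
# conjunctions or adverbs, so this only approximates relative clauses.
RELATIVE_FORMS = {
    "qui", "quae", "quod", "quem", "quam", "cuius", "cui",
    "quo", "qua", "quorum", "quarum", "quibus", "quos", "quas",
}


def split_sentences(text):
    """Split on runs of '.', '?' or '!' and drop empty pieces."""
    return [s.strip() for s in re.split(r"[.?!]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of unaccented ASCII letters."""
    return re.findall(r"[a-z]+", sentence.lower())


def relative_marker_rate(text):
    """Mean number of relative-pronoun forms per sentence (0.0 if no sentences)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    hits = sum(
        1
        for sentence in sentences
        for token in tokenize(sentence)
        if token in RELATIVE_FORMS
    )
    return hits / len(sentences)


# Usage: the opening sentence of Caesar's De Bello Gallico contains two
# marker forms ("quarum" and "qui") in a single sentence.
caesar = (
    "Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae, "
    "aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli appellantur."
)
print(relative_marker_rate(caesar))
```

A feature like this would be only one column in a feature matrix; a supervised classifier and a feature-ranking step, as described in the abstract, would then operate on many such columns computed over a full corpus.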
