Journal: Open Journal of Statistics

Deep Language Statistics of Italian throughout Seven Centuries of Literature and Empirical Connections with Miller’s 7 ∓ 2 Law and Short-Term Memory

Abstract

Statistics of languages are usually calculated by counting characters, words, sentences, and word rankings. Some of these random variables are also the main “ingredients” of classical readability formulae. Revisiting the readability formula of Italian, known as GULPEASE, shows that of the two terms that determine the readability index G (the semantic index G_C, proportional to the number of characters per word, and the syntactic index G_F, proportional to the reciprocal of the number of words per sentence), G_F is dominant because G_C is, in practice, constant for any author throughout seven centuries of Italian literature. Each author can modulate the length of sentences more freely than the length of words, and does so in ways that differ from author to author. For any author, any pair of text variables can be modelled by a linear relationship y = mx, but with a different slope m from author to author, except for the relationship between characters and words, which is the same for all authors. The most important relationship found in the paper is that between short-term memory capacity, described by Miller’s “7 ∓ 2 law” (i.e., the number of “chunks” that an average person can hold in short-term memory ranges from 5 to 9), and the word interval, a new random variable defined as the average number of words between two successive punctuation marks. The word interval can be converted into a time interval through the average reading speed. The word interval spreads over the same range as Miller’s law, and the time interval spreads over the same range as short-term memory response times. The connection between the word interval (and time interval) and short-term memory appears, at least empirically, justified and natural, although it remains to be investigated further. Technical and scientific writings (papers, essays, etc.) demand more of their readers because words are on average longer, the readability index G is lower, and word and time intervals are longer. Future work on ancient languages, such as the classical Greek and Latin literatures (or on modern-language literatures), could give us insight into the short-term memory required of their well-educated ancient readers.
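
To make the quantities named in the abstract concrete, here is a minimal Python sketch of how they might be computed for a short text. It is an illustration under stated assumptions, not the paper's own procedure. It uses the commonly cited GULPEASE constants (G = 89 - 10 × characters/words + 300 × sentences/words) and reads the two terms as G_C and G_F. The function name `text_statistics`, the punctuation sets, the tokenisation rule, and the default reading speed of 180 words per minute are all hypothetical choices made for this example.

```python
import re

# Assumption: these marks delimit word intervals; the paper may use a different set.
PUNCTUATION = ".,;:!?"
# Assumption: these marks end a sentence.
SENTENCE_END = ".!?"


def text_statistics(text, words_per_minute=180.0):
    """Return basic counts, GULPEASE terms and the word/time interval of a text.

    words_per_minute is a placeholder reading speed, not a value from the paper.
    """
    # A word is a run of letters (Unicode-aware), so "l'amore" counts as two words.
    words = re.findall(r"[^\W\d_]+", text)
    n_words = len(words)
    n_chars = sum(len(w) for w in words)
    n_sentences = sum(text.count(ch) for ch in SENTENCE_END)
    n_marks = sum(text.count(ch) for ch in PUNCTUATION)
    if n_words == 0 or n_sentences == 0 or n_marks == 0:
        raise ValueError("text needs at least one word and one punctuation mark")

    g_c = 10.0 * n_chars / n_words       # semantic term: characters per word
    g_f = 300.0 * n_sentences / n_words  # syntactic term: 1 / (words per sentence)
    g = 89.0 - g_c + g_f                 # GULPEASE readability index

    # Word interval: average number of words between successive punctuation marks.
    word_interval = n_words / n_marks
    # Time interval in seconds, via the assumed average reading speed.
    time_interval_s = word_interval * 60.0 / words_per_minute

    return {
        "words": n_words,
        "characters": n_chars,
        "sentences": n_sentences,
        "punctuation_marks": n_marks,
        "G_C": g_c,
        "G_F": g_f,
        "G": g,
        "word_interval": word_interval,
        "time_interval_s": time_interval_s,
    }


if __name__ == "__main__":
    sample = "Nel mezzo del cammin di nostra vita, mi ritrovai per una selva oscura."
    for name, value in text_statistics(sample).items():
        print(f"{name}: {value}")
```

On real texts the counts would be accumulated over a whole work or author rather than a single sentence, and details such as ellipses (counted here as three marks) or apostrophes would need more careful handling.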
