
Mathematical Philology: Entropy Information in Refining Classical Texts Reconstruction and Early Philologists Anticipation of Information Theory



Abstract

Philologists reconstructing ancient texts from variously miscopied manuscripts anticipated information theorists by centuries in conceptualizing information in terms of probability. An example is the editorial principle difficilior lectio potior (DLP): in choosing between otherwise acceptable alternative wordings in different manuscripts, “the more difficult reading [is] preferable.” As philologists at least as early as Erasmus observed (and as information theory's version of the second law of thermodynamics would predict), scribal errors tend to replace less frequent and hence entropically more information-rich wordings with more frequent ones. Without measurements, it has been unclear how effectively DLP has been used in the reconstruction of texts, and how effectively it could be used. We analyze a case history of acknowledged editorial excellence that mimics an experiment: the reconstruction of Lucretius's De Rerum Natura, beginning with Lachmann's landmark 1850 edition based on the two oldest manuscripts then known. Treating words as characters in a code, and taking the occurrence frequencies of words from a current, more broadly based edition, we calculate the difference in entropy information between Lachmann's 756 pairs of grammatically acceptable alternatives. His choices average 0.26±0.20 bits higher in entropy information (95% confidence interval, P = 0.005), as against the single bit that determines the outcome of a coin toss, and the average 2.16±0.10 bits (95%) of (predominantly meaningless) entropy information if the rarer word had always been chosen. As a channel width, 0.26±0.20 bits/word corresponds to a 0.79 (+0.09/−0.15) likelihood of the rarer word being the one accepted in the reference edition, which is consistent with the observed 547/756 = 0.72±0.03 (95%). Statistically informed application of DLP can recover substantial amounts of semantically meaningful entropy information from noise; hence the extension copiosior informatione lectio potior, “the reading richer in information [is] preferable.” New applications of information theory promise continued refinement in the reconstruction of culturally fundamental texts.
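A minimal numerical sketch (Python) of the quantities the abstract describes. The word counts below are invented for illustration, not taken from the paper's data, and reading the 0.26 bits/word “channel width” as the capacity 1 − H₂(0.79) of a binary symmetric channel is an interpretive assumption that merely reproduces the quoted figures:

import math

def surprisal_bits(count, total):
    """Entropy information of one word: I(w) = -log2 p(w), with p(w) estimated as count/total."""
    return -math.log2(count / total)

# Invented occurrence frequencies for one pair of alternative readings,
# as they might be taken from a broadly based reference edition.
TOTAL_WORDS = 50_000
freq_common, freq_rare = 120, 9

delta = surprisal_bits(freq_rare, TOTAL_WORDS) - surprisal_bits(freq_common, TOTAL_WORDS)
print(f"rarer reading carries {delta:.2f} bits more entropy information")

def binary_entropy(p):
    """H2(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Channel-width check: if the rarer reading is the accepted one with probability 0.79,
# the binary-symmetric-channel capacity 1 - H2(0.79) comes out near 0.26 bits/word.
p_rare_accepted = 0.79
print(f"1 - H2({p_rare_accepted}) = {1 - binary_entropy(p_rare_accepted):.2f} bits/word")

Averaging such per-pair surprisal differences over Lachmann's 756 pairs, with a confidence interval on the mean, is in outline the calculation summarized by the 0.26±0.20 and 2.16±0.10 bit figures.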
