
Measuring complexity with multifractals in texts. Translation effects


Abstract

Should quality be almost synonymous with complexity? Measuring quality appears audacious, even very subjective. It is hereby proposed to use a multifractal approach to quantify quality through complexity measures. A one-dimensional system is examined. It is known that (all) written texts can be regarded as one-dimensional nonlinear maps. Thus, several written texts by the same author are considered, together with their translation into an unusual language, Esperanto, and, as a baseline, their corresponding shuffled versions. Different one-dimensional time series can be used, e.g. (i) one based on word lengths and (ii) another based on word frequencies; both are used for studying, comparing and discussing the map structure. It is shown that variety in style can be measured through the D(q) and f(α) curves characterizing multifractal objects. This allows one to observe, on the one hand, whether natural and artificial languages significantly influence the writing and the translation, and, on the other hand, whether one author's texts differ technically from each other. In fact, the f(α) curves of the original texts are similar to each other, but the translated text shows marked differences. However, in each case the f(α) curves are far from parabolic, in contrast to those of the shuffled texts. Moreover, the Esperanto text has more extreme values. Criteria are thereby suggested for estimating the quality of a text, treated purely as a time series. A model is introduced to substantiate the findings: it consists in considering a text as a random Cantor set resulting from a binomial cascade of long and short words with appropriate weights. In an appendix, a connection is made with an analysis of turbulence by statistics based on Tsallis generalized entropy. In a second appendix, another view of text (language) complexity is outlined within the copying mistake map concept.
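The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure. All function names are hypothetical, the tokenization rule (a word is a maximal run of letters) is an assumption, and D(q) is estimated with a generic box-counting partition-function fit. The cascade helper builds a deterministic binomial measure, a simplification of the random Cantor set model mentioned above, chosen because its D(q) is known in closed form.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word" is a maximal run of letters, lowercased.
    return re.findall(r"[^\W\d_]+", text.lower())


def word_length_series(text):
    """Series (i): the length of each successive word."""
    return [len(w) for w in tokenize(text)]


def word_frequency_series(text):
    """Series (ii): each word replaced by its count in the whole text."""
    words = tokenize(text)
    counts = Counter(words)
    return [counts[w] for w in words]


def shuffled(series, seed=0):
    """Baseline: same values, random order (destroys correlations)."""
    out = list(series)
    random.Random(seed).shuffle(out)
    return out


def binomial_cascade(p, levels):
    """Deterministic binomial measure on 2**levels cells, weights p and 1 - p."""
    masses = [1.0]
    for _ in range(levels):
        masses = [m * w for m in masses for w in (p, 1.0 - p)]
    return masses


def _slope(xs, ys):
    # Ordinary least-squares slope of ys against xs.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def generalized_dimensions(series, qs):
    """Estimate D(q) by box counting; needs a positive series of length >= 4.

    The series is truncated to a power-of-two length and normalized to a
    probability measure on [0, 1]; boxes have size 2**-k for k = 1..levels.
    """
    levels = len(series).bit_length() - 1
    data = series[: 2 ** levels]
    total = float(sum(data))
    log_eps = [-k * math.log(2) for k in range(1, levels + 1)]
    result = {}
    for q in qs:
        ys = []
        for k in range(1, levels + 1):
            width = len(data) // 2 ** k
            mus = [sum(data[i:i + width]) / total
                   for i in range(0, len(data), width)]
            mus = [m for m in mus if m > 0]
            if q == 1:
                # Information dimension: slope of sum(mu * log mu).
                ys.append(sum(m * math.log(m) for m in mus))
            else:
                ys.append(math.log(sum(m ** q for m in mus)) / (q - 1))
        result[q] = _slope(log_eps, ys)
    return result
```

For the deterministic cascade, `generalized_dimensions(binomial_cascade(p, levels), qs)` should reproduce the analytic value D(q) = -log2(p^q + (1-p)^q) / (q - 1) for q ≠ 1, while a constant series gives D(q) = 1 for all q. Applying the same function to `word_length_series(text)` and to its `shuffled` version gives a rough way to compare a text against its baseline. The f(α) curve would follow from a Legendre transform of τ(q) = (q - 1) D(q), which this sketch does not compute.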
