Conference: WSEAS International Conference on Applied Mathematics

Statistical Structure of Printed Turkish, English, German, French, Russian and Spanish


Abstract

The statistical properties of language, the basic tool of communication, have frequently been used in the development of computer science, for example in the construction of efficient binary codes. Language itself may also be regarded as a code for certain conceptual entities. From this point of view, this study examines the statistical structures of printed Turkish, English, German, French, Russian and Spanish on the basis of the probability distribution of letters for the same semantic content. The optimal language in the sense of coding theory is then determined using Shannon's entropy measure. During the analysis, we encountered some known difficulties in evaluating Shannon's measure. To overcome these difficulties, we found regression analysis to be a convenient method. A regression equation is therefore given to generalize the entropy estimates, together with related interpretations. The main result of the paper is that the slope of the simple linear regression model gives an approximate value for the entropy of the languages.
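
As an illustration of the quantity the abstract refers to, the following minimal sketch computes Shannon's entropy, H = -Σ p_i log2 p_i, from the empirical letter distribution of a text. It is not the authors' procedure: the function name `letter_entropy`, the use of `str.isalpha` to decide what counts as a letter, and the simple lowercasing are all assumptions made for the example, and the regression step mentioned in the abstract is not reproduced here because the abstract does not specify its variables.

```python
import math
from collections import Counter


def letter_entropy(text: str) -> float:
    """Shannon entropy, in bits per letter, of the letter frequencies in `text`.

    Assumption for this sketch: a "letter" is any character for which
    str.isalpha() is true, and upper/lower case are merged with str.lower().
    """
    letters = [ch.lower() for ch in text if ch.isalpha()]
    if not letters:
        return 0.0  # no letters, so no uncertainty to measure

    counts = Counter(letters)
    total = len(letters)
    # H = -sum(p * log2(p)) over the observed letters
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical sample sentence, not data from the paper.
    sample = "The quick brown fox jumps over the lazy dog"
    print(f"{letter_entropy(sample):.3f} bits per letter")
```

Applying the same function to translations of one text in each language would give per-language estimates that can be compared, which is the kind of comparison "for the same semantic content" that the abstract describes. Estimates from short samples are biased, which may be among the known difficulties the abstract alludes to.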