
Analysis of EU Languages Through Text Compression



Abstract

In this article, we study the differences between the European languages using statistical, unsupervised methods. The analysis is conducted at three levels of language: lexical, morphological and syntactic. Our premise is that the difficulty of translating between two languages is reflected in how similar or different they are at each of these levels. The analyses are based on the concept of Kolmogorov complexity, which is used to compare language structure at the syntactic and morphological levels. The way the languages convey information at these levels is taken as a measure of similarity or dissimilarity between them, and the results are compared to classical linguistic classification. The results can serve as a tool in developing machine translation systems. For example, if the source language conveys more information at the morphological level and the target language more at the syntactic level, the (machine) translator must be able to transfer information from one level to the other.
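The abstract does not say which compressor or distance measure the authors use. Kolmogorov complexity itself is not computable, so compression-based studies typically approximate it with the output size of a real compressor. The following is a minimal sketch of that idea, assuming the normalized compression distance (NCD) and `zlib` as a stand-in compressor; the function names `compressed_size` and `ncd` and the sample sentences are illustrative and do not come from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Approximate the Kolmogorov complexity of data by its zlib-compressed length."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest the texts share
    structure; values near 1 suggest they share little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical toy samples; a real study would use large parallel corpora.
    samples = {
        "en": "The results are compared to classical linguistic classification. " * 10,
        "de": "Die Ergebnisse werden mit der klassischen Sprachklassifikation verglichen. " * 10,
        "fi": "Tuloksia verrataan klassiseen kielitieteelliseen luokitteluun. " * 10,
    }
    langs = sorted(samples)
    for i, a in enumerate(langs):
        for b in langs[i + 1:]:
            print(a, b, round(ncd(samples[a], samples[b]), 3))
```

On texts this short the distances are dominated by compressor overhead, so the numbers are only meaningful on much larger, comparable corpora. A pairwise distance matrix built this way could then be clustered and compared against linguistic groupings, which is the kind of comparison the abstract describes.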
