Recent Developments in DeReKo

Abstract

This paper gives an overview of recent developments in the German Reference Corpus DeReKo in terms of growth, maximising relevant corpus strata, metadata, legal issues, and its current and future research interface. Due to the recent acquisition of new licenses, DeReKo grew by a factor of four in the first half of 2014, mostly in the area of newspaper text, and presently contains over 24 billion word tokens. Other strata, such as fictional texts, web corpora (in particular CMC texts), and spoken but conceptually written texts, have also increased significantly. We report on the newly acquired corpora that led to the major increase, on the principles and strategies behind our corpus acquisition activities, and on our solutions for the emerging legal, organisational, and technical challenges.
