International Conference on Text, Speech and Dialogue

ParCoLab: A Parallel Corpus for Serbian, French and English

Abstract

ParCoLab is a trilingual parallel corpus containing texts in Serbian, French and English. It is developed at the CLLE-ERSS research unit (UMR 5263 CNRS) at the University of Toulouse, France, in collaboration with the Department of Romance Studies at the University of Belgrade, Serbia. Serbian being one of the less-resourced European languages, this is an important step towards the creation of freely accessible corpora and NLP tools for this language. Our main goal is to provide the scientific community with a high-quality resource that can be used in a wide range of applications, such as contrastive linguistic studies, NLP research, machine and computer-assisted translation, translation studies, second language learning and teaching, and applied lexicography. The corpus currently contains 7.1M tokens, mainly from literary works, but corpus extension and diversification efforts are ongoing. ParCoLab can be queried online and a part of it is available for download.