Journal: Language Resources and Evaluation

The corpus of Basque simplified texts (CBST)



Abstract

In this paper we present the corpus of Basque simplified texts (CBST). The corpus compiles 227 original sentences from the science popularisation domain, together with two simplified versions of each sentence. The simplified versions were created following two different approaches: the structural approach, by a court translator who follows easy-to-read guidelines, and the intuitive approach, by a teacher drawing on her experience. The aim of the corpus is to support a comparative analysis of simplified texts. To that end, we also present the annotation scheme we created to annotate the corpus. The scheme is divided into eight macro-operations: delete, merge, split, transformation, insert, reordering, no operation and other. These macro-operations can be further classified into more specific operations. We also relate our work and results to other languages. The corpus will be used to corroborate the decisions taken and to improve the design of the automatic text simplification system for Basque.
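As an illustration only, the corpus layout described in the abstract (one original sentence, two simplified versions, each annotated with macro-operations) could be modelled roughly as follows. This is a minimal sketch, not the authors' actual format: the class names, field names, and the `operation_counts` helper are all hypothetical, and only the eight macro-operation labels and the two approach names come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class MacroOperation(Enum):
    """The eight macro-operations named in the abstract."""
    DELETE = "delete"
    MERGE = "merge"
    SPLIT = "split"
    TRANSFORMATION = "transformation"
    INSERT = "insert"
    REORDERING = "reordering"
    NO_OPERATION = "no operation"
    OTHER = "other"


class Approach(Enum):
    """The two simplification approaches named in the abstract."""
    STRUCTURAL = "structural"  # court translator, easy-to-read guidelines
    INTUITIVE = "intuitive"    # teacher, based on her experience


@dataclass
class SimplifiedVersion:
    # Hypothetical record layout; the real annotation format is not given.
    approach: Approach
    text: str
    operations: List[MacroOperation] = field(default_factory=list)


@dataclass
class CorpusEntry:
    # One original sentence plus at most one version per approach.
    original: str
    versions: Dict[Approach, SimplifiedVersion] = field(default_factory=dict)

    def add_version(self, version: SimplifiedVersion) -> None:
        self.versions[version.approach] = version

    def is_complete(self) -> bool:
        # True once both the structural and the intuitive version exist.
        return set(self.versions) == set(Approach)


def operation_counts(entries: List[CorpusEntry], approach: Approach) -> Counter:
    """Count annotated macro-operations for one approach across entries."""
    counts: Counter = Counter()
    for entry in entries:
        version = entry.versions.get(approach)
        if version is not None:
            counts.update(version.operations)
    return counts
```

Comparing `operation_counts(entries, Approach.STRUCTURAL)` with `operation_counts(entries, Approach.INTUITIVE)` is one simple way the comparative analysis mentioned above could be started, assuming annotations are available in a form like this.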
