RulingBR: A Summarization Dataset for Legal Texts


Abstract

Text summarization consists of generating a shorter version of an input document that captures its main ideas. Despite recent developments in this area, most existing techniques have been tested mainly on English and Chinese, due in part to the low availability of datasets in other languages. In addition, experiments have been run mostly on collections of news articles, which could introduce bias into the research. In this paper, we address both limitations by creating a dataset for the summarization of legal texts in Portuguese. The dataset, called RulingBR, contains about 10K rulings from the Brazilian Federal Supreme Court. We describe how the dataset was assembled and report the results of standard summarization methods, which may serve as a baseline for future work.