Summarising Historical Text in Modern Languages

Abstract

We introduce the task of historical text summarisation, where documents in historical forms of a language are summarised in the corresponding modern language. This is a fundamentally important routine for historians and digital humanities researchers, but it has never been automated. We compile a high-quality gold-standard text summarisation dataset, which consists of historical German and Chinese news from hundreds of years ago summarised in modern German or Chinese. Based on cross-lingual transfer learning techniques, we propose a summarisation model that can be trained even with no cross-lingual (historical to modern) parallel data, and further benchmark it against state-of-the-art algorithms. We report automatic and human evaluations that distinguish the historical-to-modern language summarisation task from standard cross-lingual summarisation (i.e., modern to modern language), highlight the distinctness and value of our dataset, and demonstrate that our transfer learning approach outperforms standard cross-lingual benchmarks on this task.

Bibliographic record

Source: Conference of the European Chapter of the Association for Computational Linguistics, 2021, pp. 3123-3142 (20 pages)
Authors: Xutan Peng; Yi Zheng; Chenghua Lin; Advaith Siddharthan
Original format: PDF
