Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata

Abstract

While Wikipedia exists in 287 languages, its content is unevenly distributed among them. In this work, we investigate the generation of open-domain Wikipedia summaries in underserved languages using structured data from Wikidata. To this end, we propose a neural network architecture equipped with copy actions that learns to generate single-sentence, comprehensible textual summaries from Wikidata triples. We demonstrate the effectiveness of the proposed approach by evaluating it against a set of baselines on two languages of different natures: Arabic, a morphologically rich language with a larger vocabulary than English, and Esperanto, a constructed language known for its easy acquisition.

Bibliographic details

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2018 | liii 801 p. | 6 pages
Authors
Lucie-Aimée Kaffee; Hady Elsahar; Pavlos Vougiouklis; Christophe Gravier; Frederique Laforest; Jonathon Hare; Elena Simperl
Original format: PDF
Chinese Library Classification: Programming, software engineering