Conference proceedings: The semantic web

Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders


Abstract

While Wikipedia exists in 287 languages, its content is unevenly distributed among them. It is therefore of utmost social and cultural importance to focus efforts on languages whose speakers have access to only limited Wikipedia content. We investigate supporting these communities by generating summaries for Wikipedia articles in underserved languages, given structured data as input. We focus on an important support for such summaries: ArticlePlaceholders, dynamically generated content pages in underserved Wikipedias. They enable native speakers to access existing information in Wikidata. To extend those ArticlePlaceholders, we provide a system that processes the triples of the KB as they are provided by the ArticlePlaceholder and generates a comprehensible textual summary. This data-driven approach is employed with the goal of understanding how well it matches the communities' needs for two underserved languages on the Web: Arabic, a language with a large community but disproportionately limited access to knowledge online, and Esperanto, an easily acquired artificial language whose Wikipedia content is maintained by a small but devoted community. With the help of Arabic and Esperanto Wikipedians, we conduct a study that evaluates not only the quality of the generated text but also the usefulness of our end system to any underserved Wikipedia version.
