Conference: International Workshop on Computational Processing of the Portuguese Language

Applying Lexical-Conceptual Knowledge for Multilingual Multi-document Summarization

Abstract

We define Multilingual Multi-Document Summarization (MMDS) as the process of identifying the main information in a cluster of (at least) two texts, one in the user's language and one in a foreign language, and presenting it as a summary in the user's language. Although the task is relevant given the increasing amount of online information in different languages, only baselines exist for (Brazilian) Portuguese; these apply machine translation (MT) to obtain a monolingual input and use superficial features for sentence extraction. We report our investigation into applying a conceptual frequency measure to build a summary in Portuguese from a bilingual cluster (Portuguese and English). The methods tackle two additional challenges: using Princeton WordNet for noun annotation and applying MT to translate selected English sentences into Portuguese. The experiments were performed on a corpus of 20 clusters and show that lexical-conceptual knowledge improves the linguistic quality and informativeness of extracts.
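
As a rough illustration of the idea of concept-frequency sentence ranking over a bilingual cluster, the sketch below may help. It is not the authors' implementation: the abstract does not give the exact measure, so the scoring formula (average cluster-wide frequency of a sentence's concepts), the `NOUN_TO_CONCEPT` table (a hypothetical stand-in for WordNet noun annotation), and the toy sentences are all assumptions. The MT step for selected English sentences is left out.

```python
from collections import Counter

# Hypothetical noun-to-concept table standing in for WordNet noun annotation.
# Portuguese and English nouns with the same meaning share one concept ID.
NOUN_TO_CONCEPT = {
    "earthquake": "c_earthquake", "terremoto": "c_earthquake",
    "city": "c_city", "cidade": "c_city",
    "victims": "c_victim", "vítimas": "c_victim",
    "government": "c_government", "governo": "c_government",
}

def concepts(sentence):
    """Return the concept IDs of the annotated nouns in a sentence."""
    tokens = sentence.lower().replace(".", "").replace(",", "").split()
    return [NOUN_TO_CONCEPT[t] for t in tokens if t in NOUN_TO_CONCEPT]

def rank_sentences(cluster):
    """Rank (language, sentence) pairs by average concept frequency.

    Frequencies are counted over the whole bilingual cluster, so a concept
    mentioned in both languages counts more than one mentioned only once.
    """
    freq = Counter(c for _, sent in cluster for c in concepts(sent))
    scored = []
    for lang, sent in cluster:
        cs = concepts(sent)
        score = sum(freq[c] for c in cs) / len(cs) if cs else 0.0
        scored.append((score, lang, sent))
    # Highest score first; ties keep their original order.
    return sorted(scored, key=lambda item: item[0], reverse=True)

# Toy bilingual cluster (invented sentences, for illustration only).
cluster = [
    ("pt", "O terremoto atingiu a cidade."),
    ("en", "The earthquake left many victims in the city."),
    ("en", "The government promised help."),
    ("pt", "O governo visitou as vítimas do terremoto."),
]
ranking = rank_sentences(cluster)
for score, lang, sent in ranking:
    print(f"{score:.2f} [{lang}] {sent}")
```

In a full pipeline along the lines the abstract describes, any top-ranked sentence tagged `en` would then be passed through an MT system before being placed in the Portuguese summary.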
