Published in: Brazilian Computer Society. Journal

Portuguese text generation using factored language models


Abstract

As in many other natural language processing (NLP) fields, statistical methods are now part of mainstream natural language generation (NLG). Developing systems of this kind, however, raises the issue of data sparseness, a problem that is particularly evident in morphologically rich languages such as Portuguese. This work presents a shallow surface realisation system that uses factored language models (FLMs) of Portuguese to overcome some of these difficulties. The system combines FLMs trained on a large corpus with a number of NLP resources that the Brazilian NLP research community has made publicly available in recent years, such as corpora, dictionaries and thesauri. Our FLM-based approach to surface realisation has been successfully applied to the generation of Brazilian newspaper headlines, and the results are shown to outperform a number of statistical and non-statistical baseline systems.
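
To illustrate why factoring helps with sparse data, the sketch below scores a word given the previous token, backing off from the previous surface word to its lemma and then to its part-of-speech tag. It is only a minimal illustration under stated assumptions, not the system described in the abstract: the toy corpus, the factor set, the backoff order, the fixed `PENALTY` value and the function names are all hypothetical, and the scoring is a simple fixed-penalty backoff over relative frequencies rather than the backoff scheme of any particular FLM toolkit.

```python
from collections import Counter

# Each token is a bundle of factors: (word, lemma, part-of-speech tag).
# This tiny corpus is invented for illustration only.
CORPUS = [
    [("o", "o", "DET"), ("governo", "governo", "N"),
     ("anuncia", "anunciar", "V"), ("medidas", "medida", "N")],
    [("a", "o", "DET"), ("empresa", "empresa", "N"),
     ("anunciou", "anunciar", "V"), ("cortes", "corte", "N")],
]

FACTORS = {"word": 0, "lemma": 1, "pos": 2}
BACKOFF_ORDER = ["word", "lemma", "pos"]  # most specific context first
PENALTY = 0.4  # assumed fixed penalty applied at each backoff step


def train(corpus):
    """Count (previous factor value, next word) pairs for every factor."""
    pair_counts = {name: Counter() for name in FACTORS}
    context_counts = {name: Counter() for name in FACTORS}
    unigrams = Counter()
    for sentence in corpus:
        for token in sentence:
            unigrams[token[0]] += 1
        for prev, cur in zip(sentence, sentence[1:]):
            for name, idx in FACTORS.items():
                pair_counts[name][(prev[idx], cur[0])] += 1
                context_counts[name][prev[idx]] += 1
    return pair_counts, context_counts, unigrams


def score(model, prev_token, word):
    """Score `word` after `prev_token`, backing off across factors.

    Returns a penalised relative frequency, not a normalised probability.
    """
    pair_counts, context_counts, unigrams = model
    weight = 1.0
    for name in BACKOFF_ORDER:
        context = prev_token[FACTORS[name]]
        count = pair_counts[name][(context, word)]
        if count > 0:
            return weight * count / context_counts[name][context]
        weight *= PENALTY  # context unseen with this word: back off
    # No factor of the previous token was seen with this word.
    return weight * unigrams[word] / sum(unigrams.values())


model = train(CORPUS)
# "anunciou medidas" never occurs, but the shared lemma "anunciar" does
# precede "medidas", so the lemma factor still provides evidence.
print(score(model, ("anunciou", "anunciar", "V"), "medidas"))
```

In this toy setting the inflected form "anunciou" has never been seen before "medidas", yet the score is non-zero because the lemma factor pools evidence across inflections, which is the kind of sparseness relief the abstract attributes to factored models.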
