Iberian Conference on Information Systems and Technologies

Towards a hybrid NLG system for Data2Text in Portuguese

Abstract

Many new forms of interaction with machines, such as dialogue or voice output, require converting information internal to a system into sentences, which is the task of Data2Text systems. To avoid the limitations of template-based and classical NLG methods, systems based on automatic translation have been proposed in recent years. They give sentences the variability needed for better interaction, but this comes at a cost: unlike template-based systems, they produce sentences of heterogeneous quality. In this paper we propose combining a translation-based NLG system with a classifier module that provides information on the intelligibility or quality of the sentences. Sentences marked as unacceptable are replaced by template-generated ones. The classifier module is the main focus of the paper; it combines the extraction of linguistic features with a classifier trained on a manually annotated corpus. Results suggest that the approach is valid: the best results have false positives below 8%, and this metric can be even lower in practical applications, decreasing to around 3%, because the generation module produces low-quality sentences at a rate below 30%.
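The fallback logic described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`hybrid_generate`, `residual_bad_rate`), the toy generator, the toy acceptability check, and the Portuguese template are all hypothetical stand-ins. The second function encodes one plausible reading of the closing figures, namely that the share of low-quality sentences reaching the user is the false-positive rate multiplied by the rate at which the generator produces low-quality sentences. With the bounds quoted above this gives 0.08 × 0.30 = 2.4%, the same order as the roughly 3% reported; the abstract does not show the paper's exact calculation.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# Hypothetical template; the paper's actual templates are not given in the abstract.
TEMPLATE = "A temperatura em {city} é de {temp} graus."


def hybrid_generate(
    record: Record,
    translate: Callable[[Record], str],
    is_acceptable: Callable[[str], bool],
    template: str,
) -> str:
    """Return the translation-based sentence if the classifier accepts it,
    otherwise fall back to the template-based sentence."""
    candidate = translate(record)
    if is_acceptable(candidate):
        return candidate
    return template.format(**record)


def residual_bad_rate(false_positive_rate: float, low_quality_rate: float) -> float:
    """Share of all outputs that are low quality yet accepted by the classifier,
    assuming the false-positive rate is measured over low-quality sentences."""
    return false_positive_rate * low_quality_rate


def toy_translate(record: Record) -> str:
    """Toy stand-in for the translation-based generator; fails for one city."""
    if record["city"] == "Porto":
        return "graus Porto em"  # simulated low-quality output
    return f"Em {record['city']} estão {record['temp']} graus."


def toy_is_acceptable(sentence: str) -> bool:
    """Toy stand-in for the trained classifier: a crude surface check only."""
    return sentence.endswith(".") and len(sentence.split()) >= 4


good = hybrid_generate({"city": "Aveiro", "temp": "21"}, toy_translate, toy_is_acceptable, TEMPLATE)
fallback = hybrid_generate({"city": "Porto", "temp": "18"}, toy_translate, toy_is_acceptable, TEMPLATE)
```

In this sketch the first call keeps the generated sentence, while the second is rejected by the toy check and replaced by the template output. A real system would replace `toy_is_acceptable` with the classifier trained on linguistic features that the paper describes.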
