Venue: International conference on computational linguistics

Flexible Japanese Sentence Compression by Relaxing Unit Constraints

Abstract

Sentence compression is important in a wide range of natural language processing applications. Previous approaches to Japanese sentence compression fall into two groups. Word-based methods extract a subset of words from a sentence to shorten it, while bunsetsu-based methods extract a subset of bunsetsu (a bunsetsu is a text unit consisting of content words and the function words that follow them). In general, bunsetsu-based methods perform better than word-based methods. However, bunsetsu-based methods cannot drop unimportant words from within a bunsetsu, because their constraints require each bunsetsu to be treated as a unit. In this paper, we propose a novel compression method to overcome this disadvantage. Our method relaxes these constraints using Lagrangian relaxation and shortens a bunsetsu if it contains unimportant words. Experimental results show that our method effectively compresses a sentence while preserving its important information and grammaticality.
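
As a rough illustration of the general technique named in the abstract, the sketch below applies Lagrangian relaxation with subgradient updates to a toy selection problem. It is not the paper's formulation: the abstract does not give the objective or the constraints, so this example assumes a simple length budget as the relaxed constraint, and the function name `lagrangian_relaxation`, the per-unit `scores` and `lengths`, and the `budget` are all hypothetical.

```python
def lagrangian_relaxation(scores, lengths, budget, steps=100, step_size=1.0):
    """Toy Lagrangian relaxation (hypothetical setup, not the paper's model).

    Maximize sum(scores[i] * x[i]) over x[i] in {0, 1}
    subject to sum(lengths[i] * x[i]) <= budget,
    with the length constraint moved into the objective via a multiplier.
    """
    lam = 0.0  # Lagrange multiplier for the relaxed length constraint
    best_x, best_value = None, float("-inf")  # best_x stays None if nothing feasible is seen
    for t in range(1, steps + 1):
        # With the constraint relaxed, each unit is decided independently:
        # keep it only if its score outweighs its penalized length.
        x = [1 if s - lam * l > 0 else 0 for s, l in zip(scores, lengths)]
        used = sum(l * xi for l, xi in zip(lengths, x))
        # Remember the best selection that satisfies the original constraint.
        if used <= budget:
            value = sum(s * xi for s, xi in zip(scores, x))
            if value > best_value:
                best_x, best_value = x, value
        # Subgradient step: raise the penalty when over budget, lower it otherwise.
        lam = max(0.0, lam + (step_size / t) * (used - budget))
    return best_x, best_value


# Hypothetical importance scores and lengths for four text units.
selection, value = lagrangian_relaxation(
    scores=[5, 1, 4, 0.5], lengths=[3, 2, 3, 1], budget=6
)
```

The relaxed problem decomposes into independent per-unit decisions, which is the usual motivation for relaxing a constraint that couples variables. On this toy input the multiplier settles at a value where exactly the two high-scoring units are kept; in general, Lagrangian relaxation yields a bound and a heuristic solution rather than a guaranteed optimum.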