Conference: International Conference on Advanced Electronic Materials, Computers and Software Engineering

Chinese Sentence Compression Algorithm Based on Deep Analysis of Sentence Hierarchy in Multiple Application Scenarios

Abstract

In recent years, increasing attention has been paid to extracting high-value information from massive data, and the demands on efficiency and accuracy have risen accordingly. Sentence compression is a basic technology for this problem and can be widely applied in machine translation, question answering systems, and abstract generation. Because far fewer strategies have been proposed for Chinese sentence compression than for English, work on Chinese sentence compression has clear practical value. In practice, sentence compression currently faces two problems. First, traditional sentence compression methods are mostly based on parallel corpora of "original sentence, compressed sentence" pairs, which are usually difficult to build or obtain, so such corpora are scarce. Second, sentence compression is relatively rigid: the compression strategy cannot be chosen flexibly to suit the application. Against this background, this paper proposes a Chinese sentence compression algorithm for multiple application scenarios that does not rely on parallel corpora and is based on deep analysis of sentence hierarchy. Three compression methods are compared: manual compression, rule-only compression, and compression based on deep analysis of sentence hierarchy. The experimental results show that the proposed algorithm can, to some extent, adapt to different application scenarios: it flexibly lowers or raises the compression ratio within an acceptable range, obtaining correspondingly higher or slightly lower retention of information and grammaticality. Under the comprehensive index, its compression performance is significantly better than that of the other two methods, and it yields more flexible and realistic compression results for the actual application scenario.
