Journal: Journal of Engineering and Computer Innovations

New approaches to automatic headline generation for Arabic documents


Abstract

A headline can be regarded as a condensed summary of a document. Demand for automatic headline generation has grown with the need to handle very large numbers of documents, since reading each one is tedious and time-consuming. Instead of reading every document, a reader can use the headline to decide which ones contain important and relevant information. There are two major approaches to automatic headline generation. The first is linguistic and draws on knowledge of the structure of the language itself. The second is statistical and comprises all quantitative approaches to automated language processing. However, Arabic has a different statistical structure from English and requires special treatment for headline generation, especially since there is no dedicated technique for Arabic. Therefore, two new statistical methods for automatic headline generation were developed to create representative headlines for Arabic text documents. The first is an extractive method based on character cross-correlation, and the second is an abstractive method based on the hidden Markov model (HMM). The extractive method achieved a ROUGE-L score of 0.1938 and the HMM method achieved a ROUGE-L score of 0.2332. In addition, both techniques were assessed by human examiners who evaluated the resulting headlines.
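
For readers unfamiliar with the metric, ROUGE-L scores a generated headline against a reference headline using their longest common subsequence (LCS) of tokens. The following is a minimal sketch of the standard sentence-level formulation, not the authors' evaluation setup: the whitespace tokenization, the function names `lcs_length` and `rouge_l`, and the default `beta=1.0` are all assumptions made for illustration, and the abstract does not say which ROUGE-L variant or preprocessing was used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the common subsequence ending before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the best length seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure (hypothetical helper).

    Assumes whitespace tokenization; real Arabic evaluation would
    likely need normalization or stemming, which is omitted here.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (recall + b2 * precision)


if __name__ == "__main__":
    generated = "new methods for Arabic headline generation"
    reference = "new approaches to automatic headline generation for Arabic documents"
    print(round(rouge_l(generated, reference), 4))
```

Because the LCS respects word order but allows gaps, a headline that reuses the reference's words in the same order scores higher than one that merely shares vocabulary.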
