
Automatic text summarization of Wikipedia articles


Abstract

The main objective of a text summarization system is to identify the most important information in a given text and present it to end users. In this paper, Wikipedia articles are given as input to the system, and extractive summaries are produced by identifying text features and scoring the sentences accordingly. The text is first pre-processed by tokenizing the sentences and applying stemming. The sentences are then scored using several text features. Two novel features are introduced: the citations present in the text and the identification of synonyms. These are combined with traditional features to score each sentence. A neural network then uses the scores to classify each sentence as belonging to the summary or not. The user can specify what percentage of the original text the summary should contain. Scoring the sentences based on citations is found to give the best results.
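
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It makes several assumptions that the abstract does not confirm: citations are assumed to appear as bracketed numbers such as [3], the stemmer is a crude suffix stripper standing in for a real one, a fixed weighted sum (the hypothetical `citation_weight`) replaces the paper's neural network classifier, and the synonym feature is omitted. All function names are illustrative.

```python
import math
import re
from collections import Counter

# Assumption: citations look like Wikipedia-style markers, e.g. [3].
CITATION = re.compile(r"\[\d+\]")
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "it", "by"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., !, ? or a closing
    # citation bracket and precedes a capital letter.
    parts = re.split(r"(?<=[.!?\]])\s+(?=[A-Z])", text.strip())
    return [p for p in parts if p]


def stem(word):
    # Crude suffix stripping; a stand-in for a real stemmer such as Porter's.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stems_of(sentence):
    # Drop citation markers, lower-case, remove stopwords, then stem.
    words = re.findall(r"[a-z]+", CITATION.sub(" ", sentence).lower())
    return [stem(w) for w in words if w not in STOPWORDS]


def score_sentences(sentences, citation_weight=1.0):
    # Two features per sentence: normalized mean term frequency and
    # citation count. They are combined with a fixed weight here; the
    # paper feeds its feature scores to a neural network instead.
    stemmed = [stems_of(s) for s in sentences]
    freq = Counter(t for tokens in stemmed for t in tokens)
    max_freq = max(freq.values(), default=1)
    scores = []
    for sentence, tokens in zip(sentences, stemmed):
        tf = sum(freq[t] for t in tokens) / (len(tokens) * max_freq) if tokens else 0.0
        cites = len(CITATION.findall(sentence))
        scores.append(tf + citation_weight * cites)
    return scores


def summarize(text, ratio):
    # Keep the top `ratio` fraction of sentences, in their original order.
    sentences = split_sentences(text)
    if not sentences:
        return []
    k = max(1, math.ceil(ratio * len(sentences)))
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(article_text, 0.25)` would keep roughly a quarter of the sentences, mirroring the user-specified percentage mentioned in the abstract. Returning the selected sentences in their original order keeps the extract readable.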
