Conference: International Conference of Computer and Information Technology

Automatic Bangla Text Summarization Using Term Frequency and Semantic Similarity Approach

Abstract

As the volume of data in the cloud grows, it becomes harder to find the information one is looking for, which motivates text summarization. Automatic text summarization condenses textual data into a short, concise form from which readers can grasp the content. Several approaches have been proposed, but little work has been done on Bangla text summarization because of the distinctive and multifaceted structure of the Bangla language. This paper describes the implementation of two approaches for summarizing a single Bangla document: one based on term frequency and one based on semantic sentence similarity. Stopword removal, noisy-word removal, lemmatization, and tokenization are performed beforehand. Both methods return a set of top-ranked sentences to form the summary. A sentence's rank is determined by term frequency in the first approach and by sentence similarity in the second. The experimental results are favorable for both approaches, and further improvement of these approaches is expected to yield better results.
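
The two ranking schemes described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the abstract does not give the scoring formulas, so a sentence's term-frequency score is assumed here to be the sum of the document-level counts of its words, and plain bag-of-words cosine similarity stands in for the paper's semantic similarity measure. The names `STOPWORDS`, `split_sentences`, `tokenize`, `tf_scores`, `similarity_scores`, and `summarize` are hypothetical, the stopword list is a tiny placeholder, and lemmatization and noisy-word removal are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder; a real system would use a curated Bangla stopword list.
STOPWORDS = {"এবং", "ও", "এই", "একটি"}


def split_sentences(text):
    """Split on the Bangla danda (।) and on ? and !, dropping empty pieces."""
    return [p.strip() for p in re.split(r"[।?!]", text) if p.strip()]


def tokenize(sentence):
    """Whitespace tokenization with stopwords removed (no lemmatization here)."""
    words = (w.strip(",;:") for w in sentence.split())
    return [w for w in words if w and w not in STOPWORDS]


def tf_scores(sentences):
    """Assumed scheme: score = sum of document-level frequencies of the sentence's words."""
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for sent in tokens for w in sent)
    return [sum(freq[w] for w in sent) for sent in tokens]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_scores(sentences):
    """Assumed scheme: score = total similarity of a sentence to all other sentences."""
    vecs = [Counter(tokenize(s)) for s in sentences]
    return [
        sum(cosine(v, u) for j, u in enumerate(vecs) if j != i)
        for i, v in enumerate(vecs)
    ]


def summarize(text, score_fn, k=2):
    """Return the k top-ranked sentences, restored to their original order."""
    sentences = split_sentences(text)
    scores = score_fn(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, tf_scores)` or `summarize(text, similarity_scores)` selects between the two rankings. Returning the chosen sentences in document order is a design choice made here for readability, not something the abstract specifies.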