Conference: International Conference on Research and Innovation in Information Systems

Sequential pattern based multi document summarization — An exploratory approach

Abstract

Sequential Pattern Mining, which aims to discover all frequent sequences of itemsets (patterns) in a large data collection, has been applied to Text Mining tasks such as Text Categorization and Pattern Identification. In Document Summarization, however, the effort is still new and exploratory. A sentence is more than an unordered collection of words: each sentence carries its own meaning. Discovering these textual patterns is essential because the patterns describe the text while preserving the sequential order of the words in the document. The motivation here is therefore to investigate the feasibility of developing a Sequential Pattern-based Summarizer model in the near future that reduces redundant information from multiple text resources while preserving the meaning of the original text documents through a semantic similarity approach. This paper reviews some of the existing techniques in multi-document summarization to better understand the gaps and issues in this area. Incorporating the semantic knowledge of sentences across multiple documents is hoped to ease the long-winded process that researchers who are not subject experts face when trying to find similarities and correlations between text resources.
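To make the idea of frequent word sequences concrete, the following is a minimal Python sketch. It is not the authors' method: the function name `frequent_word_sequences`, the parameters `min_support` and `max_len`, and the sample sentences are all assumptions made for illustration. It uses brute-force enumeration rather than an optimized miner, and it treats a pattern as an ordered (possibly gapped) subsequence of words whose support is the number of sentences containing it.

```python
from collections import Counter
from itertools import combinations


def frequent_word_sequences(sentences, min_support=2, max_len=3):
    """Return ordered word subsequences found in at least min_support sentences.

    Hypothetical helper for illustration only: brute-force enumeration,
    practical for short sentences, not an optimized sequential pattern miner.
    """
    support = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        seen = set()
        for length in range(1, max_len + 1):
            # combinations() keeps the original word order, so gaps between
            # words are allowed but their relative order is preserved.
            seen.update(combinations(words, length))
        # Count each pattern once per sentence (support = number of sentences).
        support.update(seen)
    return {p: count for p, count in support.items() if count >= min_support}


# Made-up sample sentences standing in for text from multiple documents.
sentences = [
    "pattern mining finds frequent sequences",
    "sequential pattern mining preserves word order",
    "word order matters in a sentence",
]
patterns = frequent_word_sequences(sentences, min_support=2, max_len=2)
for pattern, count in sorted(patterns.items()):
    print(" ".join(pattern), count)
```

Because order is preserved, the pair "word order" counts as a pattern here while "order word" does not, which is the property the abstract highlights over an unordered bag of words.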
