Temporal Corpus Summarization Using Submodular Word Coverage


Abstract

In many areas of life, we now have almost complete electronic archives reaching back for well over two decades. This includes, for example, the body of research papers in computer science, all news articles written in the US, and most people's personal email. However, we have only rather limited methods for analyzing and understanding these collections. While keyword-based retrieval systems allow efficient access to individual documents in archives, we still lack methods for understanding a corpus as a whole. In this paper, we explore methods that provide a temporal summary of such corpora in terms of landmark documents, authors, and topics. In particular, we explicitly model the temporal nature of influence between documents and re-interpret summarization as a coverage problem over words anchored in time. The resulting models provide monotone sub-modular objectives for computing informative and non-redundant summaries over time, which can be efficiently optimized with greedy algorithms. Our empirical study shows the effectiveness of our approach over several baselines.
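The abstract's idea of treating summarization as a coverage problem that a greedy algorithm can optimize might be sketched as follows. This is a minimal illustration and not the authors' model: it uses plain weighted set coverage as a stand-in monotone submodular objective, represents "words anchored in time" as hypothetical `(word, year)` tuples, and omits the paper's model of temporal influence between documents. The names `coverage`, `greedy_summary`, `toy_docs`, and `toy_weight` are all assumptions made for this example.

```python
from typing import Dict, Hashable, List, Set


def coverage(selected: List[str], docs: Dict[str, Set[Hashable]],
             weight: Dict[Hashable, float]) -> float:
    """Total weight of the elements covered by at least one selected document.

    Weighted set coverage is monotone and submodular; it stands in here
    for the paper's objective, which is not reproduced.
    """
    covered: Set[Hashable] = set()
    for name in selected:
        covered |= docs[name]
    return sum(weight.get(e, 1.0) for e in covered)


def greedy_summary(docs: Dict[str, Set[Hashable]],
                   weight: Dict[Hashable, float], k: int) -> List[str]:
    """Pick up to k documents, each time taking the largest marginal gain."""
    selected: List[str] = []
    covered: Set[Hashable] = set()
    for _ in range(min(k, len(docs))):
        best_name, best_gain = None, 0.0
        for name in sorted(docs):  # sorted for deterministic tie-breaking
            if name in selected:
                continue
            # Marginal gain: weight of elements not yet covered.
            gain = sum(weight.get(e, 1.0) for e in docs[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining document adds coverage
            break
        selected.append(best_name)
        covered |= docs[best_name]
    return selected


# Toy corpus (invented data): elements are (word, year) pairs.
toy_docs = {
    "paper_a": {("svm", 1998), ("kernel", 1998), ("margin", 1998)},
    "paper_b": {("svm", 1998), ("kernel", 1998)},
    "paper_c": {("lda", 2003), ("topic", 2003)},
}
toy_weight: Dict[Hashable, float] = {}  # empty: every element weighs 1.0
summary = greedy_summary(toy_docs, toy_weight, k=2)
print(summary, coverage(summary, toy_docs, toy_weight))
```

In this toy run the greedy step skips `paper_b` because everything it covers is already covered by `paper_a`, which is how a coverage objective discourages redundant summaries. For a monotone submodular objective under a cardinality constraint, greedy selection of this kind is known to achieve at least a (1 - 1/e) fraction of the optimal value.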


Bibliographic details

Source
ACM International Conference on Information and Knowledge Management | 2012 | pp. 754-763 | 10 pages
Authors
Ruben Sipos; Adith Swaminathan; Pannaga Shivaswamy; Thorsten Joachims
Original format
PDF
Keywords
summarization; temporal; submodular


Similar literature

1. Optimizing word set coverage for multi-event summarization [J] . Yan Jihong, Cheng Wenliang, Wang Chengyu, Journal of combinatorial optimization . 2015, No. 4

2. Temporal Relationships Between Individualism-Collectivism and the Economy in Soviet Russia: A Word Frequency Analysis Using the Google Ngram Corpus [J] . Skrebyte Agne, Garnett Philip, Kendal Jeremy R. Journal of cross-cultural psychology . 2016, No. 9

3. Approximation Algorithms for Submodular Data Summarization with a Knapsack Constraint [J] . Kai Han, Enpei Zhang, Tong Xu, Performance evaluation review . 2021, No. 1

4. Temporal Corpus Summarization Using Submodular Word Coverage [C] . Ruben Sipos, Adith Swaminathan, Pannaga Shivaswamy, ACM international conference on information and knowledge management . 2012

5. Exploring recurrent word combinations in a business English learner corpus: A parallel corpus analysis and its curricular implications. [D] . Lopez Rodriguez, Jesus. 2006

6. Gaze-enabled Egocentric Video Summarization via Constrained Submodular Maximization [O] . Jia Xu, Lopamudra Mukherjee, Yin Li, -1

7. Temporal corpus summarization using submodular word coverage [O] . Ruben Sipos, Adith Swaminathan, Pannaga Shivaswamy, 2012
