Extracting Hot Spots of Topics from Time-Stamped Documents

Abstract

Identifying time periods with a burst of activities related to a topic has been an important problem in analyzing time-stamped documents. In this paper, we propose an approach to extract a hot spot of a given topic in a time-stamped document set. Topics can be basic, containing a simple list of keywords, or complex. Logical relationships such as and, or, and not are used to build complex topics from basic topics. A concept of presence measure of a topic based on fuzzy set theory is introduced to compute the amount of information related to the topic in the document set. Each interval in the time period of the document set is associated with a numeric value which we call the discrepancy score. A high discrepancy score indicates that the documents in the time interval are more focused on the topic than those outside of the time interval. A hot spot of a given topic is defined as a time interval with the highest discrepancy score. We first describe a naive implementation for extracting hot spots. We then construct an algorithm called EHE (Efficient Hot Spot Extraction) using several efficient strategies to improve performance. We also introduce the notion of a topic DAG to facilitate an efficient computation of presence measures of complex topics. The proposed approach is illustrated by several experiments on a subset of the TDT-Pilot Corpus and DBLP conference data set. The experiments show that the proposed EHE algorithm significantly outperforms the naive one, and the extracted hot spots of given topics are meaningful.
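
The abstract names the ingredients (presence measure, discrepancy score, naive search over intervals) but gives no formulas, so the following Python sketch is only an illustration under stated assumptions, not the paper's method. It assumes that the presence of a basic topic is the fraction of its keywords found in a document, that and, or, and not map to the standard fuzzy operators min, max, and 1 - x, and that the discrepancy score of an interval is the mean presence inside it minus the mean presence outside it. The names presence and naive_hot_spot and the tuple encoding of topics are hypothetical. The sketch does not reproduce EHE or the topic DAG.

```python
from itertools import accumulate


def presence(topic, doc_words):
    """Fuzzy presence of a topic in one document, as a value in [0, 1].

    A topic is a tuple (hypothetical encoding):
      ("basic", [keywords]), ("and", t1, t2, ...), ("or", t1, t2, ...), ("not", t)
    """
    op = topic[0]
    if op == "basic":
        keywords = topic[1]
        # Assumption: presence = fraction of the topic's keywords in the document.
        return sum(1 for k in keywords if k in doc_words) / len(keywords)
    if op == "and":
        return min(presence(t, doc_words) for t in topic[1:])
    if op == "or":
        return max(presence(t, doc_words) for t in topic[1:])
    if op == "not":
        return 1.0 - presence(topic[1], doc_words)
    raise ValueError(f"unknown operator: {op}")


def naive_hot_spot(docs, topic):
    """Return (score, start, end) for the interval with the highest score.

    docs is a list of (timestamp, set_of_words) pairs. Every interval of
    consecutive distinct timestamps is enumerated (quadratically many).
    """
    times = sorted({t for t, _ in docs})
    mass = {t: 0.0 for t in times}   # total presence per timestamp
    count = {t: 0 for t in times}    # number of documents per timestamp
    for t, words in docs:
        mass[t] += presence(topic, words)
        count[t] += 1

    # Prefix sums make each interval's totals an O(1) lookup.
    pm = [0.0] + list(accumulate(mass[t] for t in times))
    pc = [0] + list(accumulate(count[t] for t in times))
    total_m, total_c = pm[-1], pc[-1]

    best = None
    for i in range(len(times)):
        for j in range(i, len(times)):
            in_m = pm[j + 1] - pm[i]
            in_c = pc[j + 1] - pc[i]
            out_c = total_c - in_c
            if out_c == 0:
                continue  # the full time span has no "outside" to compare with
            # Assumed score: mean presence inside minus mean presence outside.
            score = in_m / in_c - (total_m - in_m) / out_c
            if best is None or score > best[0]:
                best = (score, times[i], times[j])
    return best


docs = [
    (1, {"election", "vote"}),
    (2, {"storm", "flood"}),
    (3, {"storm", "rain"}),
    (4, {"election"}),
]
basic = ("basic", ["storm", "flood"])
complex_topic = ("and", ("basic", ["storm"]), ("not", ("basic", ["flood"])))
print(naive_hot_spot(docs, basic))
print(naive_hot_spot(docs, complex_topic))
```

With these toy documents, the basic topic peaks at timestamp 2, where both keywords appear, while the complex topic (storm and not flood) peaks at timestamp 3. A scan-statistic style score, as the keywords suggest the paper may use, could be substituted for the assumed difference of means without changing the enumeration.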

Bibliographic details

Journal: other
Authors: Wei Chen; Parvathi Chundi
Year (volume), issue: -1 (70), 7
Pages: 642–660
Total pages: 37
Original format: PDF
Keywords: scan statistic; text mining; hot spots; topics


Similar literature

1. Extracting hot spots of topics from time-stamped documents [J]. Wei Chen, Parvathi Chundi. Data & Knowledge Engineering, 2011, No. 7
2. Discovering Hierarchical Topic Evolution in Time-Stamped Documents [J]. Jun Song, Yu Huang, Xiang Qi. Journal of the American Society for Information Science and Technology, 2016, No. 4
3. Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents [J]. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2010, No. 6
4. Extracting hot spots of basic and complex topics from time stamped documents [C]. Wei Chen, Chundi P. Computational Intelligence and Data Mining, 2009. CIDM '09. 2009
5. A MapReduce Algorithm for Finding Hotspots of Topics from Time Stamped Documents [D]. Ashokan, Ashwathy P. 2013
6. Improved Detection of Candida sp. fks Hot Spot Mutants by Using the Method of the CLSI M27-A3 Document with the Addition of Bovine Serum Albumin [O]. Guillermo Garcia-Effron, Steven Park, David S. Perlin. 2011
7. Extracting hot spots of topics from time-stamped documents [O]. Wei Chen, Parvathi Chundi. 2011
