Topic Chains for Understanding a News Corpus

Abstract

The Web is a vast resource and archive of the world's news articles. We present a framework, based on probabilistic topic modeling, for uncovering the meaningful structure and trends of important topics and issues hidden within news archives on the Web. Central to the framework is the topic chain, a temporal organization of similar topics. We experimented with various topic similarity metrics and present our insights on how best to construct topic chains. We discuss how to interpret topic chains to understand a news corpus by looking at long-term topics, temporary issues, and shifts of focus within the chains. We applied our framework to a nine-month corpus of Korean Web news and present our findings.
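
To make the idea of a topic chain concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which similarity metric or linking rule worked best, so the choice of Jensen-Shannon divergence, the adjacent-slice linking rule, the `max_divergence` threshold, and the names `js_divergence` and `build_topic_chains` are all assumptions made for illustration. Each topic is assumed to be a probability distribution over a shared vocabulary, fitted separately for each time slice.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0 here.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def build_topic_chains(slices, max_divergence=0.1):
    """Group topics into chains (hypothetical linking rule, see lead-in).

    slices[t] is the list of topics for time slice t. Topics in adjacent
    slices are linked when their divergence is at most max_divergence
    (an arbitrary illustrative threshold). Chains are the connected
    components of the resulting graph, found with union-find.
    """
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Every (slice index, topic index) pair starts as its own chain.
    for t, topics in enumerate(slices):
        for k in range(len(topics)):
            parent[(t, k)] = (t, k)

    # Link similar topics in adjacent time slices.
    for t in range(len(slices) - 1):
        for i, p in enumerate(slices[t]):
            for j, q in enumerate(slices[t + 1]):
                if js_divergence(p, q) <= max_divergence:
                    parent[find((t, i))] = find((t + 1, j))

    chains = {}
    for node in parent:
        chains.setdefault(find(node), []).append(node)
    return sorted(sorted(chain) for chain in chains.values())


# Toy data: three time slices over a four-word vocabulary.
slices = [
    [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]],
    [[0.6, 0.3, 0.1, 0.0], [0.25, 0.25, 0.25, 0.25]],
    [[0.65, 0.25, 0.1, 0.0]],
]
chains = build_topic_chains(slices)
```

In this toy setup, a chain that spans many slices would correspond to what the abstract calls a long-term topic, while a chain confined to one or two slices would resemble a temporary issue. Both the metric and the threshold are tunable choices in this sketch, not values taken from the paper.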
