Discovering Trends in Brand Interest through Topic Models

Abstract

Topic modeling is a well-known unsupervised learning technique for text data. It discovers latent patterns, called topics, in a collection of documents (a corpus), and so provides a convenient way to retrieve information from unclassified, unstructured text. Topic modeling has been used to track events, topics, and trends in domains such as academia, public health, marketing, and news. In this paper, we propose a framework for extracting topics from a large dataset of short messages for the purpose of tracking brand interest. The framework consists of training Latent Dirichlet Allocation (LDA) topic models for each brand using time intervals, and then applying the models to aggregated documents. Additionally, we present a set of preprocessing tasks that helped to improve the topic models and their outputs. The experiments demonstrate that topic modeling can successfully track people's discussions on social networks, even in massive datasets, and capture topics that spike in response to real-life events.
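The abstract does not give implementation details, so the following is only a minimal sketch of one possible reading of the framework: short messages are cleaned, pooled into one pseudo-document per brand and ISO week, and a small LDA model is fitted to one brand's weekly documents with collapsed Gibbs sampling. The stopword list, the weekly interval, the hyperparameters, the brand name "acme", and the toy messages are all assumptions made for illustration; none of them come from the paper.

```python
import random
import re
from collections import defaultdict
from datetime import date

# Toy stopword list (assumption; the paper's preprocessing steps are not specified here).
STOPWORDS = {"the", "a", "is", "and", "to", "of", "my", "in", "for", "on"}

def tokenize(text):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens not in STOPWORDS."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]

def aggregate(messages):
    """Pool short messages into one pseudo-document per (brand, ISO year, ISO week)."""
    docs = defaultdict(list)
    for brand, day, text in messages:
        year, week, _ = day.isocalendar()
        docs[(brand, year, week)].extend(tokenize(text))
    return dict(docs)

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (vocab, doc_topic, topic_word) counts."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z, wi = assign[d][i], index[w]
                doc_topic[d][z] -= 1
                topic_word[z][wi] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wi] + beta) / (topic_total[k] + size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wi] += 1
                topic_total[z] += 1
    return vocab, doc_topic, topic_word

def topic_shares(counts, alpha=0.1):
    """Smoothed topic proportions for one document, given its topic counts."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

# Hypothetical messages for a made-up brand.
messages = [
    ("acme", date(2024, 1, 1), "Love the new Acme phone camera http://t.co/x"),
    ("acme", date(2024, 1, 2), "@bob the camera on my phone is great"),
    ("acme", date(2024, 1, 9), "Acme recall battery fire"),
    ("acme", date(2024, 1, 10), "battery recall for the phone"),
]

pseudo_docs = aggregate(messages)
keys = sorted(k for k in pseudo_docs if k[0] == "acme")  # chronological order
vocab, doc_topic, topic_word = lda_gibbs([pseudo_docs[k] for k in keys], num_topics=2)
for key, counts in zip(keys, doc_topic):
    print(key, [round(s, 2) for s in topic_shares(counts)])
```

Printing the topic shares week by week gives a simple time series per brand; a sudden shift in a week's dominant topic is the kind of signal the abstract associates with real-life events. On a real corpus, an established LDA implementation would be a more practical choice than this hand-written sampler.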