Journal: ACM Transactions on Multimedia Computing, Communications, and Applications

Discovering Latent Topics by Gaussian Latent Dirichlet Allocation and Spectral Clustering


Abstract

Diversifying the retrieval results for a query improves users' search efficiency. Showing multiple aspects of the retrieved information gives users an overview of the object, which helps them quickly pinpoint what they need. To discover these aspects, research focuses on generating image clusters from the initially retrieved results. Latent Dirichlet allocation (LDA) has been shown to perform well at discovering high-level topics. However, traditional LDA is designed to process textual words and requires discrete input. When the algorithm is applied to continuous visual features, a common solution is to quantize them into discrete form with a bag-of-visual-words algorithm. During this process, quantization error inevitably causes information loss. To construct a topic model that retains the complete visual information, this work applies Gaussian latent Dirichlet allocation (GLDA) to the diversity problem in image retrieval. In this model, the traditional multinomial distribution is replaced with a Gaussian distribution to model continuous visual features. In addition, we propose a two-phase spectral clustering strategy, called dual spectral clustering, to generate clusters from the region level to the image level. Experiments on the challenging landmarks of the DIV400 database show that our proposal improves relevance and diversity by about 10% compared to traditional topic models.
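
The quantization issue mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's GLDA implementation: the two-entry codebook, the feature values, and the isotropic unit-variance Gaussian are hypothetical choices made only for illustration. The sketch shows that two different continuous features can map to the same visual word, while a Gaussian density centered on that word's codeword still tells them apart.

```python
import math

def quantize(feature, codebook):
    """Bag-of-visual-words assignment: index of the nearest codeword."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))

def gaussian_log_density(feature, mean, var):
    """Log-density of an isotropic Gaussian N(mean, var * I) at `feature`."""
    d = len(feature)
    sq_dist = sum((x - m) ** 2 for x, m in zip(feature, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

# Hypothetical 2-D codebook and two region features (illustrative values only).
codebook = [(0.0, 0.0), (4.0, 4.0)]
near = (0.1, 0.0)   # almost on top of codeword 0
far = (1.5, 1.5)    # still closest to codeword 0, but much farther away

# Discrete view: both features collapse to the same visual word.
word_near = quantize(near, codebook)
word_far = quantize(far, codebook)

# Distance discarded by quantization for the farther feature.
quant_error_far = math.dist(far, codebook[word_far])

# Continuous view: a Gaussian centered on codeword 0 scores them differently.
ll_near = gaussian_log_density(near, codebook[0], 1.0)
ll_far = gaussian_log_density(far, codebook[0], 1.0)

print(word_near, word_far, round(quant_error_far, 3))
print(ll_near > ll_far)
```

In the discrete representation the two features are indistinguishable after assignment, whereas the continuous likelihood preserves how well each feature fits the component. This is the kind of information a Gaussian emission model can retain.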
