Conference: International Conference on Knowledge Science, Engineering and Management

A Document Modeling Method Based on Deep Generative Model and Spectral Hashing

Abstract

One of the most critical challenges in document modeling is the efficient extraction of high-level representations. This paper proposes a document modeling method based on a deep generative model and spectral hashing. First, dense, low-dimensional features are learned by a deep generative model that takes word-count vectors as input. These features are then used to train a spectral hashing model that compresses a previously unseen document into a compact binary code, such that the Hamming distances between codewords correlate with semantic similarity. Retrieving similar neighbors then reduces to retrieving all items whose codewords lie within a small Hamming distance of the query's codeword. This lookup can be extremely fast, and the method shows superior performance compared with conventional methods while remaining applicable to large-scale datasets.
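
The retrieval step described in the abstract can be illustrated with a minimal sketch. It covers only the Hamming-radius lookup, not the deep generative model or the spectral hashing training. The function names `hamming_distance` and `retrieve_neighbors`, the document ids, and the hand-written 8-bit codewords are all hypothetical and do not come from the paper.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


def retrieve_neighbors(query_code: int, codes: Dict[str, int], radius: int) -> List[str]:
    """Return ids of documents whose codeword is within `radius` bits of the query,
    ordered by increasing Hamming distance (ties broken by id)."""
    hits = [(hamming_distance(query_code, code), doc_id)
            for doc_id, code in codes.items()]
    return [doc_id for dist, doc_id in sorted(hits) if dist <= radius]


# Hypothetical 8-bit codewords. In the described method these would be produced
# by the trained spectral hashing model, not written by hand.
codes = {
    "doc_a": 0b10110010,
    "doc_b": 0b10110011,  # differs from doc_a in 1 bit
    "doc_c": 0b01001101,  # differs from doc_a in all 8 bits
    "doc_d": 0b10100010,  # differs from doc_a in 1 bit
}

query = 0b10110010
neighbors = retrieve_neighbors(query, codes, radius=2)
print(neighbors)
```

This sketch scans every stored codeword linearly. A system aiming for the speed the abstract describes would more plausibly use the codeword as a hash-table address and probe nearby addresses, but that is an assumption about implementation rather than something stated in the abstract.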
