Indexing, searching, and mining large-scale visual data via structured vector quantization.

Abstract

This dissertation centers on indexing, searching, and mining methods for large-scale, high-dimensional visual data. Processing such data is widely acknowledged to be difficult, and the problem becomes more serious with "big data", which has shifted the focus of many problems in computational science. New approaches are urgently needed for processing huge collections of visual information, e.g., images and videos on the Internet.

The study first investigates the difficulties of similarity search in high-dimensional spaces and presents a new model of local intrinsic dimensionality that better fits similarity search problems such as nearest neighbor search. It then turns to the advantages and problems of applying various structured vector quantization models to large-scale visual data processing. Although many structured vector quantization models exist, this study focuses on three families: product quantization (PQ), residual quantization (RQ), and tree-structured vector quantization (TSVQ).

The main contributions of this work fall into the following lines: 1) two novel methods are proposed to tackle a problem that has existed in RQ for decades, and they are used to improve residual k-means trees for scalable clustering and to optimize two state-of-the-art approximate nearest neighbor (ANN) search systems; 2) a new inverted index is proposed for fast ANN search; 3) a systematic framework is proposed for repetition mining in long video streams; 4) a tree-embedding PQ model is proposed to improve the quality of PQ codes for ANN search. The experimental results show that the proposed methods are substantially better than existing solutions in terms of the trade-off among speed, memory usage, and accuracy.
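As a rough illustration of the residual quantization (RQ) family named in the abstract, the sketch below encodes a vector in stages: each stage picks the nearest codeword to whatever remains after the previous stages, and decoding sums the chosen codewords. This is generic textbook RQ, not the dissertation's proposed methods, which this record does not describe. The function names (`nearest`, `rq_encode`, `rq_decode`) and the toy `CODEBOOKS` are assumptions made up for the example; real codebooks would be learned, for instance by k-means on the residuals of each stage.

```python
def nearest(codebook, v):
    """Return the index of the codeword closest to v (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, v))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def rq_encode(codebooks, x):
    """Encode x as one codeword index per stage; also return the final residual."""
    residual = list(x)
    codes = []
    for cb in codebooks:
        i = nearest(cb, residual)
        codes.append(i)
        # Subtract the chosen codeword; the next stage quantizes what is left.
        residual = [r - c for r, c in zip(residual, cb[i])]
    return codes, residual


def rq_decode(codebooks, codes):
    """Reconstruct the approximation as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for cb, i in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[i])]
    return out


# Hypothetical hand-picked codebooks: a coarse stage followed by a fine stage.
CODEBOOKS = [
    [[0.0, 0.0], [4.0, 4.0], [4.0, -4.0], [-4.0, 4.0]],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
]

x = [4.75, 4.25]
codes, residual = rq_encode(CODEBOOKS, x)
approx = rq_decode(CODEBOOKS, codes)
print(codes, approx, residual)
```

With two stages of four codewords each, the vector is stored as two 2-bit indices, yet the stages combine to address up to 16 distinct reconstruction points. Product quantization differs in that it splits the vector into sub-vectors and quantizes each one independently instead of quantizing successive residuals of the whole vector.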

Bibliographic details

  • Author: Yuan, Jiangbo
  • Author affiliation: The Florida State University
  • Degree-granting institution: The Florida State University
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 145
  • Original format: PDF
  • Language: English
