Indexing, searching, and mining large-scale visual data via structured vector quantization.

Abstract

This dissertation centers on indexing, searching, and mining methods for large-scale, high-dimensional visual data. Processing such data is widely acknowledged to be difficult, and the problem becomes more serious with "big data", which has shifted the focus of many problems in computational science. New approaches are urgently needed for processing huge collections of visual information, e.g., images and videos on the Internet.

The study first investigates the difficulties of similarity search in high-dimensional spaces and presents a new model of local intrinsic dimensionality that better fits similarity search problems such as nearest neighbor search. It then discusses the advantages and problems of applying various structured vector quantization models to large-scale visual data processing. Although many structured vector quantization models exist, this study focuses on three families: product quantization (PQ), residual quantization (RQ), and tree-structured vector quantization (TSVQ).

The main contributions of this work fall into four lines: 1) two novel methods are proposed to tackle a problem that has existed in RQ for decades, and they are used to improve residual k-means trees for scalable clustering and to optimize two state-of-the-art approximate nearest neighbor (ANN) search systems; 2) a new inverted index is proposed for fast ANN search; 3) a systematic framework is proposed for repetition mining in long video streams; 4) a tree-embedding PQ model is proposed to improve the quality of PQ codes for ANN search. The experimental results show that the proposed methods are substantially better than existing solutions in terms of the trade-off among speed, memory usage, and accuracy.
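
For readers unfamiliar with residual quantization, the following is a minimal textbook-style sketch, not the dissertation's proposed methods. It assumes plain Lloyd k-means at each stage; every function name here (`kmeans`, `train_rq`, `encode`, `decode`) and the toy parameters are hypothetical and chosen only for illustration. Each stage quantizes whatever error the previous stages left behind, so a vector is stored as one small codeword index per stage.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda j: sq_dist(codebook[j], v))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd k-means; returns k centroids (illustrative, not optimized)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def train_rq(points, stages, k):
    """Train one codebook per stage on the residuals left by earlier stages."""
    codebooks = []
    residuals = [list(p) for p in points]
    for s in range(stages):
        cb = kmeans(residuals, k, seed=s)
        codebooks.append(cb)
        residuals = [
            [x - c for x, c in zip(r, cb[nearest(cb, r)])] for r in residuals
        ]
    return codebooks


def encode(codebooks, v):
    """Greedy stage-by-stage encoding: one codeword index per stage."""
    codes, r = [], list(v)
    for cb in codebooks:
        j = nearest(cb, r)
        codes.append(j)
        r = [x - c for x, c in zip(r, cb[j])]
    return codes


def decode(codebooks, codes):
    """Reconstruct a vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for cb, j in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, cb[j])]
    return out


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
    codebooks = train_rq(data, stages=3, k=8)
    codes = encode(codebooks, data[0])
    print(codes, sq_dist(data[0], decode(codebooks, codes)))
```

With 3 stages of 8 codewords each, a vector is represented by three indices (9 bits in total) while the sum of codewords can take up to 8^3 distinct values. This compactness is what makes structured quantizers attractive for large-scale ANN search.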

Bibliographic record

Author: Yuan, Jiangbo
Author affiliation: The Florida State University
Degree-granting institution: The Florida State University
Subject: Computer science
Degree: Ph.D.
Year: 2014
Pages: 145 p.
Total pages: 145
Original format: PDF
Language: English
