Distance-based indexing: Observations, applications, and improvements.

Abstract

Multidimensional indexing has long been an active research problem in computer science. Most solutions map complex data types to high-dimensional vectors of fixed length and then apply either Spatial Access Methods (SAMs) or Point Access Methods (PAMs) to the vectorized data.

More recently, however, this approach has reached its limits. Much current data is either difficult to map to a fixed-length vector (arbitrary-length strings, for example) or maps successfully only to a very high number of dimensions. In both cases, distance-based indexing is an attractive alternative: it relies only on the pairwise distances between data items to build indices that support efficient similarity search.

In this work, distance-based indexing is first approached in a general fashion, through a framework that encompasses distance-based indexing methods as well as SAMs and PAMs. Shared properties of seemingly unrelated data structures can be exploited, as shown by a single (and optimal) search algorithm that works on a variety of trees for a variety of search types.

The motivation for distance-based indexing is then shown through an application to indexing strings (biological sequences, to be exact). Simply by showing that a distance function satisfies the properties of a metric, it is illustrated that many forms of data, with various distribution characteristics, can be indexed successfully with distance-based indexing.

Finally, a probabilistic approach to indexing leads to an improved tree construction algorithm, as well as an information-based search algorithm that searches the information stored in any data structure regardless of its form (i.e., the algorithm performs equally well whether the structure is a tree or a matrix).
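The idea of indexing with nothing but pairwise distances can be illustrated with a small sketch. The code below is not the dissertation's algorithm; it is a generic pivot-table range search, assuming unit-cost edit distance as the metric on strings. The names `edit_distance`, `PivotIndex`, and `range_search` are hypothetical and chosen only for this illustration. Because edit distance satisfies the triangle inequality, any item x with |d(q, p) - d(x, p)| > r for some pivot p cannot lie within radius r of the query q, so it can be discarded without computing d(q, x).

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance, a metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


class PivotIndex:
    """Illustrative index that stores only item-to-pivot distances."""

    def __init__(self, items: Sequence[str], pivots: Sequence[str],
                 dist: Callable[[str, str], int]) -> None:
        self.items = list(items)
        self.pivots = list(pivots)
        self.dist = dist
        # Precompute d(x, p) for every item x and pivot p.
        self.table = [[dist(x, p) for p in self.pivots] for x in self.items]

    def range_search(self, query: str, radius: int) -> Tuple[List[str], int]:
        """Return (items within radius, number of distance computations)."""
        q_to_pivots = [self.dist(query, p) for p in self.pivots]
        computed = len(self.pivots)
        hits = []
        for x, row in zip(self.items, self.table):
            # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
            if any(abs(qp - xp) > radius for qp, xp in zip(q_to_pivots, row)):
                continue  # x is provably outside the query ball
            computed += 1
            if self.dist(query, x) <= radius:
                hits.append(x)
        return hits, computed


if __name__ == "__main__":
    sequences = ["ACGT", "ACGA", "ACCT", "TTTT", "GGGGGGGG"]
    index = PivotIndex(sequences, pivots=["ACGT"], dist=edit_distance)
    hits, computed = index.range_search("ACGG", radius=1)
    print(hits, computed)
```

The sketch never looks inside the strings except through `dist`, so any other function that satisfies the metric properties could be substituted without changing the index.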

Bibliographic details

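Author: Tasan, Murat
Author affiliation: Case Western Reserve University
Degree-granting institution: Case Western Reserve University
Subject: Computer Science
Degree: Ph.D.
Year: 2006
Pages: 198
Original format: PDF
Language: English
Chinese Library Classification: Automation technology; computer technology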
