Journal: Data & Knowledge Engineering

Genetic algorithms for approximate similarity queries

Abstract

Algorithms that query large sets of simple data (numbers and short character strings) are designed to retrieve every relevant element, so the answer is said to be exact. Similarity searching over complex data is much more expensive than searching over simple data. Moreover, comparison operations over complex data usually consider features extracted from each element rather than the elements themselves. Thus, even when an algorithm retrieves an exact answer, it is 'exact' with respect to the extracted features, not the original elements. Trading exactness for a shorter query response time can therefore be worthwhile. In this work we developed two search strategies based on genetic algorithms that retrieve approximate answers from data indexed by Metric Access Methods (MAM) within a limited, user-defined amount of time. These strategies support algorithms that answer both range and k-nearest neighbor queries, and they also allow the precision of the approximate answer to be estimated. Experimental evaluation shows that very good results (corresponding to what the user would expect) can be obtained in a fraction of the time required to obtain the exact answer.
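
As a rough illustration of the idea of a time-bounded genetic search for an approximate k-nearest neighbor answer, a minimal Python sketch follows. It is not the authors' method: it searches a flat list rather than a Metric Access Method, and the function name `approx_knn_ga`, its parameters, and the crossover and mutation rules are all assumptions made for this example.

```python
import random
import time


def approx_knn_ga(data, query, k, dist, time_budget_s=0.05,
                  pop_size=20, mutation_rate=0.3, seed=0):
    """Toy genetic search for an approximate k-NN answer (hypothetical sketch).

    An individual is a sorted tuple of k distinct indices into `data`.
    A lower total distance to `query` means a fitter individual.
    The search stops when the wall-clock budget is used up.
    """
    n = len(data)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(data)")
    if pop_size < 2:
        raise ValueError("pop_size must be at least 2")
    rng = random.Random(seed)
    cache = {}  # distance computations are assumed to be the expensive part

    def d(i):
        if i not in cache:
            cache[i] = dist(data[i], query)
        return cache[i]

    def cost(individual):
        return sum(d(i) for i in individual)

    def crossover(a, b):
        # The child keeps the k closest distinct candidates of both parents.
        return tuple(sorted(sorted(set(a) | set(b), key=d)[:k]))

    def mutate(individual):
        # Occasionally swap one candidate for an element not yet selected.
        if n == k or rng.random() >= mutation_rate:
            return individual
        genes = list(individual)
        new = rng.randrange(n)
        while new in genes:
            new = rng.randrange(n)
        genes[rng.randrange(k)] = new
        return tuple(sorted(genes))

    population = [tuple(sorted(rng.sample(range(n), k)))
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    deadline = time.monotonic() + time_budget_s
    while time.monotonic() < deadline:
        parents = sorted(population, key=cost)[:max(2, pop_size // 2)]
        children = [mutate(crossover(*rng.sample(parents, 2)))
                    for _ in range(pop_size - 1)]
        population = [best] + children  # elitism: never lose the best answer
        best = min(population, key=cost)
    return sorted(best, key=d)  # indices, closest first
```

Calling `approx_knn_ga(points, q, 5, my_distance, time_budget_s=0.1)` returns five indices into `points`. Because the best individual is always carried over, a larger time budget can only keep or improve the answer, which mirrors the trade-off between exactness and response time described in the abstract. When an exact answer is available for comparison, precision can be measured as the fraction of returned indices that also appear in it.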
