An index structure for similarity join based on high-frequency queries

Abstract

String databases are widely used in many applications today, and searching for texts that are similar to a query text is a common requirement. A similarity join finds pairs of texts whose similarity exceeds a given threshold. Much research has been done to reduce the running time of similarity joins. The filter-and-verify framework is one approach: it first filters out dissimilar pairs of texts and then verifies the remaining pairs. Prefix filtering is a filter-and-verify method that eliminates dissimilar pairs of texts by comparing only prefixes of the texts. However, these similarity join algorithms disregard the frequencies of queries. Data collected from Google trends explorer show that some queries appear with higher frequency than others. This paper aims to reduce the running time of similarity join by focusing on these high-frequency queries. Indices are created from the high-frequency queries to speed up these queries and any queries that are similar to them. The proposed indices and similarity join algorithm are implemented and their performance is evaluated. Experiments show that the proposed method outperforms a leading similarity join algorithm, AdaptSearch, when queries are similar to a high-frequency query.
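
The filter-and-verify idea behind prefix filtering can be illustrated with a minimal sketch. The abstract does not state the similarity function, the tokenization, or the layout of the proposed indices, so everything below is an assumption made for illustration: texts are split into word tokens, similarity is Jaccard, and the helper names (`tokenize`, `prefix`, `build_prefix_index`, `search`) are hypothetical. It shows generic prefix filtering only, not the authors' index structure and not AdaptSearch.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split a text into a set of lowercase word tokens (assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def prefix(tokens, rank, threshold):
    """Return the first |x| - ceil(t * |x|) + 1 tokens under a global order.

    If two sets have Jaccard similarity >= t, their prefixes under the same
    order share at least one token, so non-overlapping prefixes rule a pair out.
    """
    ranked = sorted(tokens, key=rank)
    # The small epsilon guards against float error such as 0.7 * 10 > 7;
    # erring low only lengthens the prefix, which keeps the filter safe.
    required = math.ceil(threshold * len(ranked) - 1e-9)
    return ranked[: len(ranked) - required + 1]


def build_prefix_index(records, threshold):
    """Build an inverted index from prefix tokens to record ids."""
    token_sets = [tokenize(r) for r in records]
    df = Counter(tok for s in token_sets for tok in s)

    def rank(tok):
        # Rare tokens first; ties broken alphabetically for a total order.
        return (df.get(tok, 0), tok)

    index = defaultdict(set)
    for rid, s in enumerate(token_sets):
        for tok in prefix(s, rank, threshold):
            index[tok].add(rid)
    return token_sets, rank, index


def search(query, token_sets, rank, index, threshold):
    """Filter candidates by prefix overlap, then verify with exact Jaccard."""
    q = tokenize(query)
    candidates = set()
    for tok in prefix(q, rank, threshold):
        candidates |= index.get(tok, set())
    return sorted(
        rid for rid in candidates if jaccard(q, token_sets[rid]) >= threshold
    )


records = [
    "world cup 2014 schedule",
    "world cup 2014 results",
    "weather forecast bangkok",
    "cheap flights to bangkok",
]
token_sets, rank, index = build_prefix_index(records, 0.6)
print(search("world cup 2014 schedule brazil", token_sets, rank, index, 0.6))  # [0]
```

In this example the filter step yields records 0 and 1 as candidates, and the verify step keeps only record 0 (similarity 0.8), since record 1 reaches only 0.5. Ordering tokens by ascending document frequency is a common choice because rare tokens in the prefix produce fewer candidates.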

Bibliographic record

Source
International computer science and engineering conference, 2014, pp. 415-420 (6 pages)
Authors
Kunanusont K.; Chongstitvatana J.
Original format: PDF
Keywords
indexing; information filtering; query processing; text analysis; AdaptSearch similarity join algorithm; Google trends explorer; filter-and-verify framework; high-frequency queries; index structure; performance evaluation; prefix filtering; query texts; strings databases; Algorithm design and analysis; Computer science; Filtering; Filtering algorithms; Indexes; Time-frequency analysis; High-frequency queries; Prefix filtering; Similarity join;

Similar literature

1. Coding-based Join Algorithms For Structural Queries On Graph-structured XML Document [J]. Hongzhi Wang, Jianzhong Li, Wei Wang. World Wide Web. 2008, No. 4
2. Selective Flooding Based on Relevant Nearest-Neighbor using Query Feedback and Similarity across Unstructured Peer-to-Peer Networks [J]. Iskandar Ishak, Naomie Salim. Journal of computer sciences. 2009, No. 3
3. Selective Flooding Based on Relevant Nearest-Neighbor using Query Feedback and Similarity across Unstructured Peer-to-Peer Networks | Science Publications [J]. Iskandar Ishak, Naomie Salim. Journal of computer sciences. 2009, No. 3
4. An index structure for similarity join based on high-frequency queries [C]. Kunanusont K., Chongstitvatana J. International computer science and engineering conference. 2014
5. Haptic Modulation of High-frequency Vibration based on Human Perceptual Similarity [D]. 曹南. 2019
6. In-Network Processing of an Iceberg Join Query in Wireless Sensor Networks Based on 2-Way Fragment Semijoins [O]. Hyunchul Kang. 2015
7. Cluster Analysis to Find Sets of High-frequency Queries for Filtering in Similarity Join [O]. Kamolwan Kunanusont, Jaruloj Chongstitvatana. 1970
8. Content Based Retrieval Database Management System with Support for Similarity Searching and Query Refinement [R]. Ortega-Binderberger, M. 2002
