Using positional sequence patterns to estimate the selectivity of SQL LIKE queries

Aytimur Mehmet; Cakmak Ali


Abstract

Sequence patterns are frequently employed in expert system applications across a wide range of domains, from bioinformatics to smart homes and stock market analysis. Regular sequence patterns cannot express whether two consecutive items in a pattern occur right after each other in all pattern occurrences in an item database. Such a distinction may be important for many intelligent system applications, for instance, to better address business questions like "should two frequently-bought-together items be located right next to each other on a retail store shelf, or is it OK to place them at some distance as long as they are in the same aisle?". In this paper, we propose a novel type of sequence pattern, called "positional sequence patterns", and illustrate its application on a special expert system, i.e., the query planner/optimizer of a database management system. Positional sequence patterns carry extra information on whether a frequent ordered item pair always occurs next to each other, without any gap in between, in all pattern occurrences. Since positional sequence patterns are not considered by the existing sequence pattern mining algorithms, we also propose an algorithm to mine them. Next, we integrate the positional sequence patterns into the selectivity estimation component of the query optimizer as an expert system application. More specifically, in the knowledge base of the query optimizer, a histogram-like structure of positional sequence patterns is created and stored. Then, during query optimization time, these histograms are utilized to infer the selectivity of flexible text queries that are enabled by the SQL LIKE operator. In particular, the proposed selectivity estimation method employs redundant pattern elimination based on pattern information content during histogram construction, and a partitioning-based matching scheme. The experimental results on a real dataset from DBLP show that the proposed approach outperforms the state of the art, improving error rates by around 20%. © 2020 Elsevier Ltd. All rights reserved.
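The distinction the abstract draws, between an ordered item pair that merely co-occurs and one that is always adjacent, can be illustrated with a small brute-force sketch. This is not the authors' mining algorithm or histogram structure, which the abstract does not specify. The function names `like_to_regex`, `selectivity`, and `always_adjacent`, the sample rows, and the row-level reading of "always adjacent" (every row containing the pair in order also contains it contiguously) are all assumptions made for illustration.

```python
import re

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern to a regex.

    Assumption: only the % and _ wildcards are handled; no ESCAPE clause.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def selectivity(rows, pattern):
    """Exact fraction of rows matching the LIKE pattern (brute-force scan)."""
    rx = like_to_regex(pattern)
    return sum(1 for r in rows if rx.fullmatch(r)) / len(rows)

def always_adjacent(rows, a, b):
    """True if every row containing a ... b in order also contains ab contiguously.

    Assumption: a and b are plain strings without % or _ characters.
    """
    gapped = [r for r in rows if like_to_regex(f"%{a}%{b}%").fullmatch(r)]
    return bool(gapped) and all(a + b in r for r in gapped)

# Hypothetical sample data, loosely in the style of publication titles.
rows = ["data mining", "data stream mining", "query mining", "database"]

print(selectivity(rows, "%data%mining%"))      # gapped pattern
print(selectivity(rows, "%data mining%"))      # contiguous pattern
print(always_adjacent(rows, "data", " mining"))
print(always_adjacent(rows, "que", "ry"))
```

When a pair is flagged as always adjacent, the gapped query `%a%b%` and the contiguous query `%ab%` select the same rows, so a single stored frequency can serve both. When it is not, the two selectivities can differ, as with `%data%mining%` versus `%data mining%` in the sample rows.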


Bibliographic record

Source: Expert systems with applications, 2021, Issue 3, 113762.1-113762.17 (17 pages)
Authors: Aytimur Mehmet; Cakmak Ali
Affiliations:
Istanbul Sehir Univ Dept Comp Sci Istanbul Turkey;
Istanbul Tech Univ Dept Comp Engn Istanbul Turkey
Original format: PDF
Language: English
Keywords: Selectivity estimation; Histograms; Data management; Sequence pattern mining; Information content

Similar literature

1. SQL Injection Attack classification through the feature extraction of SQL query strings using a Gap-Weighted String Subsequence Kernel [J]. Paul R. McWhirter, Kashif Kifayat, Qi Shi. Information Security Technical Report. 2018, Issue JUNa
2. Estimating the selectivity of LIKE queries using pattern-based histograms [J]. Mehmet Aytimur, Ali Cakmak. Turkish Journal of Electrical Engineering and Computer Sciences. 2018, Issue 6
3. Incremental sequence-based frequent query pattern mining from XML queries [J]. Guoliang Li, Jianhua Feng, Jianyong Wang. Data Mining and Knowledge Discovery. 2009, Issue 3
4. An object-oriented SQL (OSQL) based on association pattern query formulation [C]. Guo, M. 1993
5. SQL query disassembler: An approach to managing the execution of large SQL queries. [D]. Meng, Yabin. 2007
6. Testing migration patterns and estimating founding population size in Polynesia by using human mtDNA sequences [O]. Rosalind P. Murray-McIntosh, Brian J. Scrimshaw, Peter J. Hatfield. 1998
7. SQL Injection Attack classification through the feature extraction of SQL query strings using a Gap-Weighted String Subsequence Kernel [O]. Paul R. McWhirter, Kashif Kifayat, Qi Shi. 2018
