Journal: Knowledge and Information Systems

A compact multi-resolution index for variable length queries in time series databases


Abstract

We study the problem of searching for similar patterns in time series data with variable length queries. Recently, a multi-resolution indexing technique (MRI) was proposed (Kahveci and Singh, in Proceedings of the International Conference on Data Engineering, pp. 273-282, 2001; Kahveci and Singh, IEEE Trans Knowl Data Eng 16(4):418-433, 2004) to address this problem; it uses compression as an additional step to reduce the index size. In this paper, we propose an alternative technique, called compact MRI (CMRI), which uses the adaptive piecewise constant approximation (APCA) representation as its dimensionality reduction technique and occupies much less space without requiring compression. We implemented both MRI and CMRI and conducted extensive experiments to evaluate and compare their performance on real stock data as well as synthetic data. Our results indicate that CMRI provides much better precision, ranging from 0.75 to 0.89 on real data and from 0.80 to 0.95 on synthetic data, whereas for MRI these ranges are 0.16 to 0.34 and 0.03 to 0.65, respectively. Compared to sequential scan, we found that CMRI is 4-30 times faster and that the number of disk I/Os it requires is close to the minimum. In terms of storage utilization, CMRI occupies 1% of the memory occupied by MRI. These results and analysis show CMRI to be an efficient and scalable indexing technique for large time series databases.
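To make the APCA idea mentioned in the abstract concrete, the following is a minimal sketch of a piecewise constant approximation with adaptive segment boundaries. It is not the construction used in the paper: a simple greedy bottom-up merge is swapped in for illustration, and the names `segment_cost` and `apca_greedy` are hypothetical. Each output pair holds a segment's mean value and the index of its right endpoint.

```python
from math import inf
from typing import List, Tuple


def segment_cost(prefix: List[float], prefix_sq: List[float],
                 start: int, end: int) -> float:
    """Squared error of approximating values[start:end] by their mean."""
    n = end - start
    s = prefix[end] - prefix[start]
    sq = prefix_sq[end] - prefix_sq[start]
    return sq - s * s / n


def apca_greedy(values: List[float], num_segments: int) -> List[Tuple[float, int]]:
    """Illustrative APCA-style approximation (assumed greedy merge, not the paper's method).

    Returns num_segments pairs (mean_value, right_endpoint_index).
    """
    if not 1 <= num_segments <= len(values):
        raise ValueError("num_segments must be between 1 and len(values)")

    # Prefix sums let us compute any segment's error in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    # Segment i covers values[bounds[i]:bounds[i + 1]]; start with one point each.
    bounds = list(range(len(values) + 1))

    while len(bounds) - 1 > num_segments:
        best_i, best_increase = -1, inf
        # Try removing each interior boundary, i.e. merging two neighbours.
        for i in range(1, len(bounds) - 1):
            left, mid, right = bounds[i - 1], bounds[i], bounds[i + 1]
            increase = (segment_cost(prefix, prefix_sq, left, right)
                        - segment_cost(prefix, prefix_sq, left, mid)
                        - segment_cost(prefix, prefix_sq, mid, right))
            if increase < best_increase:
                best_i, best_increase = i, increase
        del bounds[best_i]  # merge the pair that adds the least error

    return [((prefix[b] - prefix[a]) / (b - a), b - 1)
            for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    series = [1, 1, 1, 5, 5, 9, 9, 9]
    print(apca_greedy(series, 3))
```

Because segment lengths adapt to the data, flat regions collapse into a single long segment while rapidly changing regions keep shorter ones, which is what distinguishes this kind of representation from a fixed-width piecewise average.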
