International Conference on Very Large Data Bases (VLDB 2008)

Online Maintenance of Very Large Random Samples on Flash Storage


Abstract

Recent advances in flash media have made it an attractive alternative for data storage in a wide spectrum of computing devices, such as embedded sensors, mobile phones, PDAs, laptops, and even servers. However, flash media has many unique characteristics that make existing data management/analytics algorithms designed for magnetic disks perform poorly with flash storage. For example, while random (page) reads are as fast as sequential reads, random (page) writes and in-place data updates are orders of magnitude slower than sequential writes. In this paper, we consider an important fundamental problem that would seem to be particularly challenging for flash storage: efficiently maintaining a very large (100 MB or more) random sample of a data stream (e.g., of sensor readings). First, we show that previous algorithms such as reservoir sampling and geometric file are not readily adapted to flash. Second, we propose B-FILE, an energy-efficient abstraction for flash media to store self-expiring items, and show how a B-FILE can be used to efficiently maintain a large sample in flash. Our solution is simple, has a small (RAM) memory footprint, and is designed to cope with flash constraints in order to reduce latency and energy consumption. Third, we provide techniques to maintain biased samples with a B-FILE and to query the large sample stored in a B-FILE for a subsample of an arbitrary size. Finally, we present an evaluation with flash media that shows our techniques are several orders of magnitude faster and more energy-efficient than (flash-friendly versions of) reservoir sampling and geometric file. A key finding of our study, of potential use to many flash algorithms beyond sampling, is that "semi-random" writes (as defined in the paper) on flash cards are over two orders of magnitude faster and more energy-efficient than random writes.
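
For context, the classic reservoir sampling algorithm that the abstract names as a baseline can be sketched roughly as follows. This is a minimal in-memory illustration of the textbook algorithm (Algorithm R), not the paper's flash-friendly variant and not the B-FILE; the function name `reservoir_sample` and the `overwrites` log are assumptions introduced here for illustration only. The sketch shows that, once the reservoir is full, every accepted item overwrites a uniformly random slot, which is the random in-place write pattern the abstract describes as slow on flash.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Textbook reservoir sampling (Algorithm R): uniform sample of size k.

    Returns the sample and a log of the slot indices that were overwritten
    in place (the log is a hypothetical addition, kept only for illustration).
    """
    rng = rng or random.Random()
    reservoir = []
    overwrites = []
    for i, item in enumerate(stream):
        if i < k:
            # Initial fill: items are appended sequentially.
            reservoir.append(item)
        else:
            # Pick j uniformly from 0..i (inclusive); keep item if j < k.
            j = rng.randint(0, i)
            if j < k:
                # In-place update at a random slot of the sample.
                reservoir[j] = item
                overwrites.append(j)
    return reservoir, overwrites


sample, overwrites = reservoir_sample(range(100_000), k=1_000, rng=random.Random(0))
print(len(sample), len(overwrites))
```

If the sample were stored on flash rather than in a Python list, each entry in `overwrites` would correspond to an update at an unpredictable location, which suggests why the abstract reports that this algorithm is not readily adapted to flash.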
