
Predictive data grouping using successor prediction.



Abstract

Latency is an ever-increasing component of data access costs, which in turn are often the bottleneck for modern high-performance systems. The ability to predict future data accesses is essential to any attempt at addressing this problem, and we present a novel model for gathering and utilizing data access predictions. Prior attempts to utilize access predictions have taken the form of a single predictive engine attempting to preemptively fetch data. We offer a more powerful model that separates the process of access prediction from the data retrieval mechanism. Predictions are made on a per-file basis and used to provide a minimal amount of additional metadata, which in turn is used by a grouping mechanism to automatically associate related items. This approach allows truly opportunistic utilization of predictive information, with few of the timing restrictions of prior approaches. Our research covers access prediction, grouping based on predictions, and a discussion of predictability and its meaning in the context of I/O behavior.

We present two predictors: Noah, named for its prediction of pairs, and Recent Popularity, a majority-voting mechanism. We distinguish the goal of predicting the most events accurately (general accuracy) from the goal of offering the most accurate predictions (specific accuracy). Both predictors can trade the number of events predicted for accuracy. Trace-based evaluation demonstrates that their error rates can be adjusted to less than 2% for more than 60% of all access requests. Predictions are used to provide a minimal amount of per-file additional metadata, which is then used separately by our grouping mechanism.

To demonstrate the usefulness of grouping, we present the aggregating cache, which manages distributed file system caches based upon groups built from our successor predictions. We present trace-driven results demonstrating that grouping can reduce LRU demand fetches by 50% to 60%.
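The majority-voting idea behind Recent Popularity can be sketched as follows. This is a minimal illustration only: it assumes a j-out-of-k vote over each file's most recently observed successors, and the class and parameter names are ours, not the dissertation's.

```python
from collections import deque, Counter

class RecentPopularityPredictor:
    """Sketch of per-file successor prediction by majority vote over the
    last k observed successors. A prediction is emitted only when one
    candidate gathers at least j votes, trading the number of events
    predicted for accuracy, as the abstract describes. The j/k scheme
    and all names here are illustrative assumptions."""

    def __init__(self, j=2, k=4):
        self.j = j               # minimum votes required to emit a prediction
        self.k = k               # how many recent successors are remembered
        self.history = {}        # file -> deque of its last k successors
        self.last_access = None  # previously accessed file, if any

    def record(self, accessed_file):
        # The current access is the successor of the previous access.
        if self.last_access is not None:
            succ = self.history.setdefault(self.last_access,
                                           deque(maxlen=self.k))
            succ.append(accessed_file)
        self.last_access = accessed_file

    def predict(self, file):
        """Return the predicted successor of `file`, or None when no
        candidate reaches the vote threshold j (no prediction offered)."""
        succ = self.history.get(file)
        if not succ:
            return None
        candidate, votes = Counter(succ).most_common(1)[0]
        return candidate if votes >= self.j else None
```

For example, after replaying the access sequence A, B, A, B, A, C, A, B, the predictor has seen A followed by B three times and by C once, so `predict("A")` returns `"B"`; a file with no history yields no prediction.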
If we consider the effects of intervening caches, we observe dramatic gains for our predictive cache. Our treatment includes information-theoretic results that justify our approach, a graphical explanation of the effects of caches on workload predictability (cache-frequency plots), and a comparison of relative predictor performance (rank-difference plots).
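The benefit of fetching predicted groups rather than single files, as the aggregating cache does, can be illustrated with a small sketch. The fixed chain depth, the set-based cache with no eviction, and all names below are simplifying assumptions, not the dissertation's actual mechanism.

```python
class GroupedFetchCacheSketch:
    """Sketch of grouped prefetching: on a demand miss, fetch the missed
    file together with its predicted successor chain, so subsequent
    accesses to group members hit in cache. Eviction is omitted for
    clarity; a real cache would bound its size."""

    def __init__(self, successor_of, depth=2):
        self.successor_of = successor_of  # per-file metadata: file -> predicted successor
        self.depth = depth                # how far to follow the successor chain
        self.cache = set()
        self.demand_fetches = 0

    def access(self, f):
        if f not in self.cache:
            self.demand_fetches += 1
            # One retrieval brings in the whole predicted group.
            cur = f
            for _ in range(self.depth + 1):
                if cur is None:
                    break
                self.cache.add(cur)
                cur = self.successor_of.get(cur)
```

With successor metadata {A: B, B: C}, accessing A, B, C in order costs one demand fetch instead of three, which is the kind of demand-fetch reduction the trace-driven results above quantify.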

Record Details

  • Author: Amer, Ahmed M.
  • Affiliation: University of California, Santa Cruz.
  • Degree-granting institution: University of California, Santa Cruz.
  • Subject: Computer Science.
  • Degree: Ph.D.
  • Year: 2002
  • Pages: 138 p.
  • Format: PDF
  • Language: eng
