A study of the applicability of recommender systems for the Production and Distributed Analysis system PanDA of the ATLAS Experiment

M Titov; G Záruba; K De; A Klimentov; S Jha; ATLAS Collaboration


Abstract

Scientific computing has advanced in the ways it deals with massive amounts of data, as production capacities have increased significantly over the last decades. Most large science experiments require vast computing and data storage resources in order to provide results or predictions based on the data obtained. For scientific distributed computing systems with hundreds of petabytes of data and thousands of users, it is important to keep track not just of how data is distributed in the system, but also of individual users' interests in the distributed data (that is, to reveal implicit connections between users and data objects). This, however, requires the collection and use of specific statistics, such as correlations between data distribution, the mechanics of data distribution, and, above all, user preferences. This work focuses on user activities (specifically, data usage) and interests in one such distributed computing system, PanDA (Production ANd Distributed Analysis system). PanDA is a high-performance workload management system originally designed to meet production and analysis requirements for a data-driven workload at the Large Hadron Collider Computing Grid for the ATLAS Experiment hosted at CERN (the European Organization for Nuclear Research). We investigate whether data gathered in the past in PanDA shows trends indicating that users share interests that persist in subsequent data usage (i.e., data usage patterns), using data mining techniques such as association analysis, sequential pattern mining, and the basics of the recommender system approach. We show that such common interests between users indeed exist and could therefore be used to provide recommendations (in terms of collaborative filtering) to help users with their data selection process.
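As a rough illustration of the collaborative filtering idea mentioned in the abstract, the sketch below recommends data objects to a user based on what users with overlapping usage histories have accessed. It is only a minimal sketch under stated assumptions, not the method used in the paper: the usage log, the user and dataset names, the choice of Jaccard similarity, and the helper functions `build_profiles`, `jaccard`, and `recommend` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical implicit-feedback log of (user, dataset) accesses.
# Illustrative values only; this is not real PanDA data.
usage_log = [
    ("alice", "ds_A"), ("alice", "ds_B"), ("alice", "ds_C"),
    ("bob", "ds_A"), ("bob", "ds_B"), ("bob", "ds_D"),
    ("carol", "ds_E"),
]


def build_profiles(log):
    """Group the log into one set of used datasets per user."""
    profiles = defaultdict(set)
    for user, dataset in log:
        profiles[user].add(dataset)
    return dict(profiles)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, profiles, top_n=3):
    """User-based collaborative filtering on implicit usage data.

    Each dataset the target user has not used yet is scored by the
    summed similarity of the other users who have used it.
    """
    own = profiles.get(user, set())
    scores = defaultdict(float)
    for other, items in profiles.items():
        if other == user:
            continue
        sim = jaccard(own, items)
        if sim == 0.0:
            continue  # no shared interests, so no evidence to use
        for item in items - own:
            scores[item] += sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


profiles = build_profiles(usage_log)
print(recommend("alice", profiles))
```

In this toy log, "alice" and "bob" share two datasets, so the one dataset only "bob" has used becomes a candidate for "alice", while "carol", who shares nothing with the others, receives no recommendations. A real study would also have to weight repeated usage and account for time ordering, which this sketch ignores.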


Bibliographic details

Source: Journal of Physics: Conference Series, 2018, Issue 4
Original format: PDF
Chinese Library Classification: Physics

