Structural mining of large-scale behavioral data from the Internet.

Abstract

As the Internet becomes ever more pervasive in the lives of hundreds of millions of people, our understanding of its physical structure has outpaced our understanding of the dynamic patterns of traffic generated by its users. This work aims to develop a better understanding of the structure of Internet traffic in a manner consistent with individual privacy and computational constraints. I first examine network flow data from the Internet2 network, using it to form "behavioral networks" based on the flows attributable to specific network applications. The heavy-tailed distributions associated with these networks suggest unbounded variance and poorly defined means in distributions of user behavior. However, a novel application of hierarchical clustering to similarity data derived from these networks makes it possible to classify network applications robustly based on their observed behavior. I then focus on Web traffic, using a large collection of HTTP request data to build a weighted subset of the Web graph. Analysis of this weighted graph reveals more heavy-tailed distributions and the presence of a large body of stationary traffic. The traffic data are also shown to contradict key assumptions of the random surfer model used by PageRank. I conclude with the development of ABC, a behaviorally plausible agent-based model of Web traffic that incorporates backtracking, bookmarks, and a sense of topical locality. The ABC model is shown to approximate real user activity more accurately than PageRank on both artificial and empirically generated graphs.
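
For readers unfamiliar with the random surfer model that the abstract uses as its baseline, the following is a minimal sketch of the textbook formulation of PageRank computed by power iteration. It is not taken from the dissertation and does not implement the ABC model, whose details the abstract does not give. The function name `pagerank`, the damping factor of 0.85, the uniform handling of dangling nodes, and the toy graph are all assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Textbook random-surfer PageRank by power iteration.

    graph: dict mapping each node to a list of nodes it links to.
    damping: probability of following a link rather than jumping
             to a uniformly random node (0.85 is an assumed default).
    """
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            for v in outs:
                # The surfer picks each out-link with equal probability.
                new[v] += damping * rank[u] / len(outs)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph, for illustration only.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
```

In this formulation the surfer never backtracks, keeps no bookmarks, and jumps to every page with equal probability. Those are the kinds of simplifications that the abstract says an agent-based model with backtracking, bookmarks, and topical locality is meant to relax.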

Bibliographic record

Author: Meiss, Mark
Author affiliation: Indiana University
Degree-granting institution: Indiana University
Subject: Web Studies; Computer Science; Artificial Intelligence
Degree: Ph.D.
Year: 2010
Pages: 321 p.
Original format: PDF
Language: English
