2010 Proceedings IEEE INFOCOM

Link Homophily in the Application Layer and its Usage in Traffic Classification

Abstract

We address the following questions. Is there link homophily in the application layer traffic? If so, can it be used to accurately classify traffic in network trace data without relying on payloads or properties at the flow level? Our research shows that the answers to both of these questions are affirmative in real network trace data. Specifically, we define link homophily to be the tendency for flows with common IP hosts to have the same application (P2P, Web, etc.) compared to randomly selected flows. The presence of link homophily in trace data provides us with statistical dependencies between flows that share common IP hosts. We utilize these dependencies to classify application layer traffic without relying on payloads or properties at the flow level. In particular, we introduce a new statistical relational learning algorithm, called Neighboring Link Classifier with Relaxation Labeling (NLC+RL). Our algorithm has no training phase and does not require features to be constructed. All that it needs to start the classification process is traffic information on a small portion of the initial flows, which we refer to as seeds. In all our traces, NLC+RL achieves above 90% accuracy with less than 5% seed size; it is robust to errors in the seeds and various seed-selection biases; and it is able to accurately classify challenging traffic such as P2P with over 90% Precision and Recall.
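The abstract does not give the update rule used by NLC+RL, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that two flows are neighbors when they share an IP host, measures link homophily as the same-application rate among neighboring flows versus all flow pairs, and propagates seed labels with a generic neighbor-averaging form of relaxation labeling. The function names (`neighbors_by_host`, `homophily`, `classify`), the iteration count, and the toy addresses are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def neighbors_by_host(flows):
    """For each flow index, return the set of other flows sharing an IP host."""
    by_host = defaultdict(set)
    for i, (src, dst) in enumerate(flows):
        by_host[src].add(i)
        by_host[dst].add(i)
    nbrs = [set() for _ in flows]
    for members in by_host.values():
        for i in members:
            nbrs[i] |= members
    for i in range(len(flows)):
        nbrs[i].discard(i)
    return nbrs


def homophily(flows, labels):
    """Same-label rate among host-sharing flow pairs vs. among all flow pairs."""
    nbrs = neighbors_by_host(flows)
    linked = [(i, j) for i in range(len(flows)) for j in nbrs[i] if i < j]
    all_pairs = list(combinations(range(len(flows)), 2))

    def same_rate(pairs):
        return sum(labels[i] == labels[j] for i, j in pairs) / len(pairs)

    return same_rate(linked), same_rate(all_pairs)


def classify(flows, seeds, classes, iterations=20):
    """Propagate seed labels to unlabeled flows (assumed update rule).

    Seeds keep their labels. Every other flow starts with a uniform belief
    and, in each round, takes the average of its neighbors' beliefs.
    """
    nbrs = neighbors_by_host(flows)
    uniform = {c: 1.0 / len(classes) for c in classes}
    belief = [
        {c: float(c == seeds[i]) for c in classes} if i in seeds else dict(uniform)
        for i in range(len(flows))
    ]
    for _ in range(iterations):
        updated = []
        for i in range(len(flows)):
            if i in seeds or not nbrs[i]:
                updated.append(belief[i])  # fixed seed or isolated flow
                continue
            total = {c: sum(belief[j][c] for j in nbrs[i]) for c in classes}
            norm = sum(total.values())
            updated.append({c: total[c] / norm for c in classes})
        belief = updated
    return [max(classes, key=lambda c: belief[i][c]) for i in range(len(flows))]


# Toy trace with made-up addresses: (source IP, destination IP) per flow.
flows = [
    ("10.0.0.1", "10.0.0.2"),  # flow 0, seed: P2P
    ("10.0.0.1", "10.0.0.3"),  # flow 1
    ("10.0.0.3", "10.0.0.4"),  # flow 2
    ("10.0.0.5", "10.0.0.6"),  # flow 3, seed: Web
    ("10.0.0.5", "10.0.0.7"),  # flow 4
    ("10.0.0.7", "10.0.0.8"),  # flow 5
]
seeds = {0: "P2P", 3: "Web"}
predicted = classify(flows, seeds, classes=["P2P", "Web"])
```

In this toy trace the two seeds sit in separate groups of host-sharing flows, so each unlabeled flow ends up with the label of the seed it is connected to. A real trace would need the seed fractions, error tolerance, and stopping criteria evaluated in the paper itself, which this sketch does not reproduce.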
