In bit-stream unknown protocol identification, a key problem is how to separate captured multi-protocol data frames into single-protocol groups. To address it, we propose an improved agglomerative hierarchical clustering algorithm. Building on the traditional agglomerative approach (AGNES) and the characteristics of bit-stream data frames, the algorithm defines a similarity between data frames and a separate similarity between clusters. It extracts clusters that meet the requirements while clustering is still in progress, which lets it cluster data frames quickly and effectively. The number of clusters is determined automatically rather than supplied as input, and each resulting cluster carries a similarity evaluation index. Tests on the data set published by the Lincoln Laboratory show that the algorithm clusters protocol data frames with relatively high accuracy.
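The general idea of threshold-stopped agglomerative clustering over bit-string frames can be illustrated with a minimal Python sketch. It is not the algorithm proposed here: the abstract does not give the similarity definitions or the extraction criterion, so the sketch assumes a bitwise agreement ratio between frames, average linkage between clusters, and a hypothetical `merge_threshold` that stops merging (which is what fixes the number of clusters without user input). The function names `frame_similarity`, `cluster_similarity`, `cohesion`, and `cluster_frames` are illustrative, and extraction of finished clusters during clustering is omitted.

```python
from itertools import combinations


def frame_similarity(a: str, b: str) -> float:
    """Assumed measure: fraction of equal bits over the shorter frame's length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def cluster_similarity(c1: list[str], c2: list[str]) -> float:
    """Assumed measure: average pairwise frame similarity (average linkage)."""
    total = sum(frame_similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def cohesion(cluster: list[str]) -> float:
    """Mean pairwise similarity inside one cluster; 1.0 for a single frame."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0
    return sum(frame_similarity(a, b) for a, b in pairs) / len(pairs)


def cluster_frames(frames: list[str], merge_threshold: float = 0.8) -> list[list[str]]:
    """Merge the two most similar clusters until none reach merge_threshold."""
    clusters = [[f] for f in frames]
    while len(clusters) > 1:
        # Find the most similar pair of clusters in the current partition.
        (i, j), best = max(
            (
                ((i, j), cluster_similarity(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < merge_threshold:
            break  # no pair is similar enough; cluster count is now fixed
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    # Made-up toy frames for illustration only, not taken from any data set.
    toy_frames = ["11110000", "11110001", "00001111", "00001110"]
    for group in cluster_frames(toy_frames):
        print(group, cohesion(group))
```

The per-cluster `cohesion` value plays the role of a similarity score attached to each result. Recomputing all pairwise cluster similarities on every pass keeps the sketch short but is slow for large frame sets.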