Conference: IEEE International Conference on Computer and Communications

Keyword sequence extraction based on byte entropy iterative segmentation

Abstract

In mixed unknown message classification, variable fields in messages make keyword sequence extraction and message analysis difficult. In this paper, we propose a message classification and keyword sequence extraction method based on byte entropy iterative segmentation (BEIS). BEIS first divides the messages into coarse clusters and extracts the first keyword from each cluster. It then finds the remaining keywords iteratively. In each round of keyword extraction, we define links (pairs of adjacent bytes), locate the keyword's offset in the messages from the frequency of the links, and estimate the keyword's length from the byte entropy trend. The part of each message that remains after the extracted keyword is excluded becomes the input to the next round. Experimental results show that BEIS is more accurate and more stable than related existing algorithms.
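The abstract does not give the exact procedure, so the following is only a minimal Python sketch of one extraction round under stated assumptions. It assumes a "link" is counted per (byte pair, offset) across the messages of one cluster, that the keyword offset is the offset of the most frequent link, and that the keyword's length is found by extending from that offset while the per-offset byte entropy stays at or below a fixed threshold. The function names and the `threshold` parameter are hypothetical, and the fixed threshold is a simple stand-in for the paper's "byte entropy trend" criterion, which may differ. The coarse clustering step is not shown.

```python
import math
from collections import Counter


def byte_entropy(messages, offset):
    """Shannon entropy (in bits) of the byte values found at `offset`."""
    values = [m[offset] for m in messages if offset < len(m)]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def most_frequent_link(messages):
    """Return (link, offset, count) for the most frequent adjacent-byte pair.

    Assumption: a link is counted together with the offset at which it occurs.
    """
    counts = Counter()
    for m in messages:
        for i in range(len(m) - 1):
            counts[(m[i:i + 2], i)] += 1
    (link, offset), count = counts.most_common(1)[0]
    return link, offset, count


def estimate_keyword_length(messages, offset, threshold=0.5):
    """Extend from `offset` while the byte entropy stays <= threshold.

    Assumption: a fixed threshold approximates the entropy-trend test.
    """
    min_len = min(len(m) for m in messages)
    length = 0
    while (offset + length < min_len
           and byte_entropy(messages, offset + length) <= threshold):
        length += 1
    return length


# Toy cluster: two variable bytes, the constant field b"AB", one variable byte.
messages = [b"\x01\x02AB\x10", b"\x03\x04AB\x20", b"\x05\x06AB\x30"]
link, offset, count = most_frequent_link(messages)
length = estimate_keyword_length(messages, offset)
keyword = messages[0][offset:offset + length]
```

In this toy cluster the only link shared by all three messages is `b"AB"` at offset 2, and the entropy rises from 0 to log2(3) bits at offset 4, so the sketch stops after two bytes. A fuller version would remove the extracted keyword and repeat the round on the remaining bytes, as the abstract describes.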
