Because dynamic network data continuously shapes the dynamic competitive environment of enterprises, methods for detecting duplicate network data in dynamic data environments need to be studied. Existing methods, however, cannot compute in detail the final number of distinct values taken by each data attribute, and therefore suffer from low duplicate-detection accuracy. To address this, a method for detecting duplicate network data in dynamic data environments based on a comprehensive weighting method is proposed. The method first uses the mean value method to compute the final unified grade of each data attribute in the network, which yields a subjective grade vector for the data attributes. It then gives the edit distance between data strings and computes the corresponding distance similarity, combines this with the ISNM method to obtain character keywords, and compares adjacent data within a window. Duplicate network data in the dynamic data environment are detected according to the results of this comparison. Experimental results show that the proposed method achieves high detection accuracy and can effectively meet the application requirements of duplicate network data detection in dynamic data environments.
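The pipeline summarized above (attribute grades averaged into weights, edit-distance similarity, comparison of neighbouring records inside a window) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the weight normalization, the similarity definition `1 - distance / max_length`, the plain sort-then-slide window standing in for ISNM, and every function name, parameter, and threshold below are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed similarity: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def mean_weights(ratings: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average the grades given to each attribute, then normalize to sum 1.

    The normalization step is an assumption; the abstract only mentions
    a mean-value method for the attribute grades.
    """
    means = {k: sum(r[k] for r in ratings) / len(ratings) for k in ratings[0]}
    total = sum(means.values())
    return {k: v / total for k, v in means.items()}


def record_similarity(r1: Record, r2: Record, weights: Dict[str, float]) -> float:
    """Weighted sum of per-attribute string similarities."""
    return sum(w * similarity(r1[k], r2[k]) for k, w in weights.items())


def find_duplicates(records: List[Record], weights: Dict[str, float],
                    key_attr: str, window: int = 3,
                    threshold: float = 0.85) -> List[Tuple[int, int]]:
    """Sort by a key attribute and compare each record only with the
    records that follow it inside the sliding window (hypothetical
    stand-in for the window comparison described in the abstract)."""
    order = sorted(range(len(records)), key=lambda i: records[i][key_attr])
    pairs = []
    for pos, i in enumerate(order):
        for j in order[pos + 1: pos + window]:
            if record_similarity(records[i], records[j], weights) >= threshold:
                pairs.append((min(i, j), max(i, j)))
    return pairs
```

A possible use, with made-up records and grades: `weights = mean_weights([{"name": 4, "city": 2}, {"name": 2, "city": 2}])` followed by `find_duplicates(records, weights, key_attr="name")` returns index pairs of records judged to be duplicates. Restricting comparisons to a window keeps the number of pairwise comparisons roughly linear in the number of records, at the cost of missing duplicates whose sort keys fall far apart.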