Based on the characteristics of frequent induced subtrees, an encoding-based mining algorithm, called the EFITM algorithm, is presented. The algorithm represents the original database with a breadth-first encoding, which minimizes the encoded size of every single projection in the projected database. Encoded intervals are used to denote the projected database of each node on the rightmost path of a subtree, which reduces the size of the whole projected database and thereby improves the efficiency of mining frequent induced subtrees. Experimental results confirm the correctness of the EFITM algorithm and show that it achieves high mining efficiency.
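The abstract does not give the encoding format, so the following is only a minimal sketch of what a breadth-first encoding of a single tree might look like. It assumes rooted, ordered, labeled trees stored as `(label, children)` pairs and encodes each node as a `(label, parent_position)` pair; the function name `bfs_encode` and this representation are illustrative assumptions, not the EFITM encoding itself.

```python
from collections import deque


def bfs_encode(tree):
    """Encode a rooted, ordered, labeled tree in breadth-first order.

    Hypothetical representation (not taken from the paper): a tree is a
    (label, children) pair, and each node is emitted as
    (label, parent_position), where parent_position is the index of the
    parent in the output list, or -1 for the root.
    """
    encoding = []
    queue = deque([(tree, -1)])  # (subtree, position of its parent)
    while queue:
        (label, children), parent = queue.popleft()
        position = len(encoding)  # index this node receives in the output
        encoding.append((label, parent))
        for child in children:  # children keep their left-to-right order
            queue.append((child, position))
    return encoding


# A small tree: A has children B and C; B has one child D.
example = ("A", [("B", [("D", [])]), ("C", [])])
print(bfs_encode(example))
# [('A', -1), ('B', 0), ('C', 0), ('D', 1)]
```

In such an encoding, the children of a node occupy one contiguous run of positions, which is one plausible reason an interval-based description of projections could stay compact; how EFITM actually derives its intervals is not stated in the abstract.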