Journal: Computational and Mathematical Methods in Medicine

A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences

Abstract

Order-preserving submatrices (OPSMs) are an important unsupervised learning model that has been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems. Unfortunately, most existing methods are heuristic algorithms that cannot reveal all OPSMs in this NP-complete problem. In particular, deep OPSMs, which correspond to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm is adjusted to find all common subsequences (ACS) between every two row sequences, so that no deep OPSM is missed. Then, an improved prefix tree data structure is used to store and traverse the ACS, and the Apriori principle is employed to mine the frequent sequential patterns efficiently. Finally, experiments were carried out on gene and synthetic datasets. The results demonstrate the effectiveness and efficiency of the method.
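
To make the idea concrete, the following is a minimal brute-force sketch in Python, not the paper's algorithm. It assumes each row is turned into the sequence of its column indices sorted by ascending value, enumerates the common subsequences of every pair of row sequences, and then counts which rows support each pattern. It uses neither the prefix tree nor the Apriori pruning mentioned in the abstract, and it runs in exponential time, so it is only suitable for tiny matrices. All function names (`row_to_sequence`, `common_subsequences`, `supports`, `mine_opsms`) and the parameters `min_support` and `min_len` are hypothetical names chosen for illustration.

```python
from itertools import combinations


def row_to_sequence(row):
    """Return the column indices of `row` ordered by ascending value."""
    return tuple(sorted(range(len(row)), key=lambda j: row[j]))


def supports(seq, pattern):
    """True if `pattern` occurs as a subsequence of the permutation `seq`."""
    pos = {col: i for i, col in enumerate(seq)}
    idx = [pos[c] for c in pattern]
    return all(x < y for x, y in zip(idx, idx[1:]))


def common_subsequences(a, b, min_len=2):
    """All common subsequences of two row sequences (brute force).

    Assumes `a` and `b` are permutations of the same column indices.
    `combinations` keeps the order of `a`, so each candidate is already
    a subsequence of `a`; it only has to be checked against `b`.
    """
    found = set()
    for k in range(min_len, len(a) + 1):
        for sub in combinations(a, k):
            if supports(b, sub):
                found.add(sub)
    return found


def mine_opsms(matrix, min_support=2, min_len=2):
    """Map each column pattern to the list of rows that preserve its order.

    Any pattern supported by at least two rows is a common subsequence of
    some pair of rows, so collecting candidates pairwise misses nothing
    as long as min_support >= 2.
    """
    seqs = [row_to_sequence(r) for r in matrix]
    candidates = set()
    for a, b in combinations(seqs, 2):
        candidates |= common_subsequences(a, b, min_len)
    result = {}
    for pattern in candidates:
        rows = [i for i, s in enumerate(seqs) if supports(s, pattern)]
        if len(rows) >= min_support:
            result[pattern] = rows
    return result


if __name__ == "__main__":
    # Rows 0 and 1 rise across columns 0, 1, 2; row 2 falls.
    example = [[1, 2, 3], [4, 5, 6], [3, 2, 1]]
    for pattern, rows in sorted(mine_opsms(example).items()):
        print(pattern, rows)
```

Each key of the returned dictionary is a column ordering and each value is the set of rows whose values increase along that ordering, which together describe one order-preserving submatrix. A pattern that is long but supported by only two rows still appears in the output, which is the sense in which pairwise enumeration does not lose deep OPSMs.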