Developing an efficient knowledge discovering model for mining fuzzy multi-level sequential patterns in sequence databases

Tony Cheng-Kui Huang


Abstract

Sequential pattern mining from sequence databases has been recognized as an important data mining problem with various applications. Items in a sequence database can be organized into a concept hierarchy according to a taxonomy. Based on the hierarchy, sequential patterns can be found not only at the leaf nodes (individual items) of the hierarchy but also at its higher levels; this is called multiple-level sequential pattern mining. In previous research, however, taxonomies were based on crisp relationships between any two disjoint levels and therefore cannot handle the uncertainty and fuzziness of real life. For example, tomatoes could be classified into the Fruit category but could also be regarded as belonging to the Vegetable category. To deal with the fuzzy nature of taxonomy, Chen and Huang developed a novel knowledge discovering model to mine fuzzy multi-level sequential patterns, in which the relationship from one level to another is represented by a value between 0 and 1. In their work, a generalized sequential patterns (GSP)-like algorithm was developed to find fuzzy multi-level sequential patterns. This algorithm, however, faces a difficult problem: the mining process may have to generate and examine a huge set of combinatorial subsequences, and it requires multiple scans of the database. In this paper, we propose a new efficient algorithm, based on the divide-and-conquer strategy, to mine this type of pattern. In addition, another efficient algorithm is developed to discover fuzzy cross-level sequential patterns. Because the proposed algorithm greatly reduces the effort of candidate subsequence generation, performance improves significantly. Experiments show that the proposed algorithm is much more efficient and scalable than the previous one. For mining real-life databases, our work enhances the model's practicability and could promote more applications in business.
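The idea of a fuzzy taxonomy can be made concrete with a small sketch. The abstract does not give the paper's definitions, so everything below is an illustrative assumption rather than the author's method: the membership degrees (0.7, 0.6), the names `FUZZY_TAXONOMY`, `degree`, `sequence_degree` and `fuzzy_support`, the simplification that each sequence element is a single item rather than an itemset, and the min/max convention for combining degrees. It is not the divide-and-conquer algorithm itself, only a way to see how a pattern at the category level might be scored.

```python
from typing import Dict, List, Sequence

# Hypothetical fuzzy taxonomy: leaf item -> {category: membership degree in [0, 1]}.
# The numbers are invented for illustration only.
FUZZY_TAXONOMY: Dict[str, Dict[str, float]] = {
    "Tomato": {"Fruit": 0.7, "Vegetable": 0.6},
    "Apple": {"Fruit": 1.0},
    "Cabbage": {"Vegetable": 1.0},
}


def degree(item: str, label: str) -> float:
    """Degree to which a leaf item matches a label (the item itself or a category)."""
    if item == label:
        return 1.0
    return FUZZY_TAXONOMY.get(item, {}).get(label, 0.0)


def sequence_degree(sequence: Sequence[str], pattern: Sequence[str]) -> float:
    """Best degree with which `pattern` occurs as a subsequence of `sequence`.

    Assumed convention: one occurrence scores the minimum degree over its
    matched positions; the sequence scores the maximum over all occurrences.
    """
    # best[j] = best score for matching the first j pattern elements so far.
    best: List[float] = [1.0] + [0.0] * len(pattern)
    for item in sequence:
        # Go from the longest prefix down so one item extends a match only once.
        for j in range(len(pattern), 0, -1):
            candidate = min(best[j - 1], degree(item, pattern[j - 1]))
            if candidate > best[j]:
                best[j] = candidate
    return best[len(pattern)]


def fuzzy_support(database: Sequence[Sequence[str]], pattern: Sequence[str]) -> float:
    """Average degree of `pattern` over all sequences in the database."""
    return sum(sequence_degree(seq, pattern) for seq in database) / len(database)


database = [
    ["Tomato", "Cabbage"],
    ["Apple", "Tomato"],
    ["Cabbage", "Apple"],
]
support = fuzzy_support(database, ["Fruit", "Vegetable"])
print(round(support, 4))
```

With these made-up degrees, the pattern (Fruit, Vegetable) is matched fully by no sequence, partially by the first two, and not at all by the third, so its support lies between the crisp extremes of 0 and 1. A crisp taxonomy would have forced Tomato into exactly one category and changed which sequences count.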


Bibliographic details

Source
Fuzzy Sets and Systems, 2009, Issue 23, pp. 3359-3381 (23 pages)
Author
Tony Cheng-Kui Huang
Affiliation

Department of Business Administration, National Chung Cheng University, 168, University Rd., Min-Hsiung, Chia-Yi, Taiwan, Republic of China

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
data mining; sequential patterns; multi-level; fuzzy sets; combinatorial problems

Similar literature

1. A novel knowledge discovering model for mining fuzzy multi-level sequential patterns in sequence databases [J]. Yen-Liang Chen, Tony Cheng-Kui Huang. Data & Knowledge Engineering, 2008, Issue 3.
2. Mining Fuzzy Sequential Patterns with Fuzzy Time-Intervals in Quantitative Sequence Databases [J]. Truong Duc Phuong, Do Van Thanh, Nguyen Duc Dung. Cybernetics and Information Technologies: CIT, 2017, Issue 2.
3. A new approach for discovering fuzzy quantitative sequential patterns in sequence databases [J]. Yen-Liang Chen, Tony Cheng-Kui Huang. Fuzzy Sets and Systems, 2006, Issue 12.
4. Developing an Efficient Knowledge Discovering Model for Mining Fuzzy Multi-level Sequential Patterns in Sequence Databases [C]. International Conference on New Trends in Information and Service Science, 2009.
5. Efficient Periodic Pattern Mining in Time Series & Sequence Databases [D]. Rasheed, Faraz. 2011.
6. Efficient mining gapped sequential patterns for motifs in biological sequences [O]. Vance Chiang-Chi Liao, Ming-Syan Chen. 2013.
7. Mining Sequential Patterns More Efficiently by Reducing the Cost of Scanning Sequence Databases [O]. Jiahong Wang, Yoshiaki Asanuma, Eiichiro Kodama. 2006.
8. Efficient bit string implementation of a database cross-field association system (with an application to protein sequence patterns) [R]. Guigo, R.; Vazquez, I.; Smith, T. F. 1992.
