Significant Pattern Mining: Efficient Algorithms and Biomedical Applications

Abstract

Pattern mining techniques such as itemset mining, sequence mining, and graph mining have been applied to a wide range of datasets. To convince biomedical researchers, however, it is necessary to show the statistical significance of the obtained patterns, that is, to show that they are unlikely to emerge from random data. The key concept in significance testing is the family-wise error rate (FWER), i.e., the probability that at least one pattern is falsely discovered under the null hypotheses. In the worst case, the FWER grows linearly with the number of all possible patterns. We show that, in practice, the FWER grows much more slowly than in the worst case, and that it is possible to find significant patterns in biomedical data. The following two properties are exploited to bound the FWER accurately and to compute small p-value correction factors: (1) only closed patterns need to be counted; (2) patterns of low support can be ignored, where the support threshold depends on the Tarone bound. We introduce efficient depth-first search algorithms for discovering all significant patterns and discuss parallel implementations.
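To make property (2) concrete, the following is a minimal sketch of a Tarone-style correction factor. It is not the algorithm from the talk. It assumes a two-class dataset and a one-sided Fisher exact test, for which a pattern of support x cannot reach a p-value below C(n_pos, x) / C(n, x). The function names `min_p_value` and `tarone_correction_factor` and the toy supports are hypothetical, chosen for illustration only.

```python
from math import comb


def min_p_value(support, n, n_pos):
    """Lower bound on the one-sided Fisher exact p-value of a pattern.

    Assumption: n transactions, n_pos of them in the positive class.
    The smallest p-value arises when every occurrence of the pattern falls
    in the positive class. For support > n_pos the value 1 / C(n, n_pos)
    is used, which is still a valid (monotone) lower bound.
    """
    x = min(support, n_pos)
    return comb(n_pos, x) / comb(n, x)


def tarone_correction_factor(supports, n, n_pos, alpha=0.05):
    """Return (k, m_k) for the smallest k with m(k) <= k.

    m(k) counts the patterns whose minimum achievable p-value is at most
    alpha / k, i.e. the patterns that could be significant at that level.
    Testing those patterns at level alpha / k keeps the FWER at most alpha.
    """
    min_ps = [min_p_value(s, n, n_pos) for s in supports]
    k = 1
    while True:
        m_k = sum(1 for p in min_ps if p <= alpha / k)
        if m_k <= k:
            return k, m_k
        k += 1  # terminates because m(k) never exceeds len(supports)


# Toy example: supports of 10 hypothetical patterns, 20 transactions, 10 positive.
supports = [1, 1, 1, 2, 2, 3, 4, 5, 6, 8]
k, testable = tarone_correction_factor(supports, n=20, n_pos=10, alpha=0.05)
print(k, testable)  # prints "3 3"; a plain Bonferroni correction would use 10
```

In this toy case only the patterns with support 5, 6, and 8 can reach a p-value of 0.05 / 3 or less, so the low-support patterns drop out of the correction factor, which is the effect the support threshold in the abstract refers to.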

Bibliographic details

Source
International Symposium on String Processing and Information Retrieval; Workshop on Compression, Text, and Algorithms | 2016 | pp. qt14-qt14 | 1 page
Author
Koji Tsuda
Original format: PDF

Similar literature

1. An efficient XML query pattern mining algorithm for ebXML applications in e-commerce [J]. Tsui-Ping Chang. African Journal of Business Management. 2014, No. 18
2. An efficient algorithm of frequent XML query pattern mining for ebXML applications in e-commerce [J]. Tsui-Ping Chang, Shih-Ying Chen. Expert Systems with Applications. 2012, No. 2
3. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications [J]. Yiyan Zhang, Yi Xin, Qin Li, BioMedical Engineering OnLine. 2017, No. 1
4. Two Efficient Algorithms for Mining High Utility Sequential Patterns [C]. Chuankai Zhang, Yiwen Zu, Junli Nie, IEEE Intl Conf on Parallel Distributed Processing with Applications; IEEE Intl Conf on Social Computing Networking; IEEE Intl Conf on Big Data Cloud Computing; IEEE Intl Conf on Sustainable Computing Communications. 2019
5. Algorithmic Approaches for Determining Spatial Patterns in Several Biomedical Applications [D]. Chen, Zihe. 2019
6. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications [O]. Yiyan Zhang, Yi Xin, Qin Li, 2017
7. An efficient XML query pattern mining algorithm for ebXML applications in e-commerce [O]. Chang Tsui-Ping. 2014
