Computer Implemented Method for Discovery of Markov Boundaries from Datasets with Hidden Variables

Abstract

Methods for Markov boundary discovery are important recent developments in pattern recognition and applied statistics, primarily because they offer a principled solution to the variable/feature selection problem and give insight into local causal structure. Currently, there are two major families of local methods for identifying Markov boundaries from data: methods that directly implement the definition of the Markov boundary, and newer compositional Markov boundary methods that are more sample efficient and thus often more accurate in practical applications. However, in datasets with hidden (i.e., unmeasured or unobserved) variables, compositional Markov boundary methods may miss some Markov boundary members. The present invention circumvents this limitation of the compositional Markov boundary methods and proposes a new method that can discover Markov boundaries from datasets with hidden variables, and can do so in a much more sample-efficient manner than methods that directly implement the definition of the Markov boundary. In general, the inventive method transforms a dataset with many variables into a minimal reduced dataset in which all variables are needed for optimal prediction of some response variable. The power of the invention was empirically demonstrated with data generated by Bayesian networks and with 13 real datasets from a diversity of application domains.
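To make the notion of a Markov boundary concrete, the following is a minimal sketch of the first family mentioned in the abstract: a generic grow-shrink style search that applies the definition directly through conditional independence tests. It is not the inventive method, and it does not handle hidden variables. The function names (`partial_corr_pvalue`, `grow_shrink_markov_boundary`, `make_demo_data`), the Fisher z-test, the linear-Gaussian assumption, and the synthetic data are all assumptions made for illustration only.

```python
import math
import numpy as np


def partial_corr_pvalue(data, i, j, cond):
    """Two-sided p-value of a Fisher z-test for X_i independent of X_j given X_cond.

    Assumes roughly linear-Gaussian data (an assumption of this sketch).
    """
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)  # precision matrix of the selected variables
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def grow_shrink_markov_boundary(data, target, alpha=0.001):
    """Return column indices forming a candidate Markov boundary of `target`."""
    n_vars = data.shape[1]
    mb = []
    # Grow phase: add any variable still dependent on the target given
    # the current candidate set, until nothing more can be added.
    changed = True
    while changed:
        changed = False
        for v in range(n_vars):
            if v == target or v in mb:
                continue
            if partial_corr_pvalue(data, target, v, mb) < alpha:
                mb.append(v)
                changed = True
    # Shrink phase: drop variables that are independent of the target
    # given the remaining candidates (false positives from the grow phase).
    for v in list(mb):
        rest = [u for u in mb if u != v]
        if partial_corr_pvalue(data, target, v, rest) >= alpha:
            mb.remove(v)
    return sorted(mb)


def make_demo_data(n=2000, seed=0):
    """Synthetic data: X0 -> T -> X1 <- X2, plus two noise columns X3, X4.

    Columns 0..4 hold X0..X4 and column 5 holds the response T.
    """
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n)
    t = 0.8 * x0 + rng.normal(size=n)
    x2 = rng.normal(size=n)
    x1 = 0.8 * t + 0.8 * x2 + rng.normal(size=n)
    x3 = rng.normal(size=n)
    x4 = rng.normal(size=n)
    return np.column_stack([x0, x1, x2, x3, x4, t])


if __name__ == "__main__":
    data = make_demo_data()
    mb = grow_shrink_markov_boundary(data, target=5)
    reduced = data[:, mb + [5]]  # reduced dataset: boundary members plus response
    print("candidate Markov boundary:", mb)
    print("reduced dataset shape:", reduced.shape)
```

In this toy graph the Markov boundary of T consists of its parent (X0), its child (X1), and the child's other parent (X2), so the reduced dataset would be expected to keep those columns and drop the noise columns. Because each test conditions on the whole candidate set, the number of samples needed grows quickly with the size of the boundary, which is the sample-efficiency concern the abstract raises about this family of methods.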

Bibliographic details

Publication number: US2011202322A1

Patent type:
Publication date: 2011-08-18

Original format: PDF
Applicant/Assignee: ALEXANDER STATNIKOV; KONSTANTINOS (CONSTANTIN) F. ALIFERIS

Application number: US20100689944
Inventors: ALEXANDER STATNIKOV; KONSTANTINOS (CONSTANTIN) F. ALIFERIS

Filing date: 2010-01-19
Classification: G06F17/10
Country: US
Date added to database: 2022-08-21 18:15:35