Computer-aided learning of a probabilistic network from a data record of measured, experimentally determined and/or empirical values, comprising learning the structure of an undirected graph having nodes and undirected edges from the data record

Abstract

The method for computer-aided learning of a probabilistic network from a data record of measured, experimentally determined and/or empirical values comprises: learning, from the data record, the structure of an undirected graph having nodes (1, 2, 3, 4, 5, 6, 7, 8, 9) and undirected edges between the nodes; producing, for each variable, an undirected sub-graph from the undirected graph; and learning, from each undirected sub-graph independently of the other undirected sub-graphs, the structure and parameters of a directed sub-graph with nodes and directed edges between the nodes and/or the structure and parameters of a sub-graph of a probabilistic graphical model with nodes and edges between the nodes. The probabilistic network comprises a directed graph structure with nodes and directed edges between the nodes. The nodes represent variables of the data record, and the directed edges represent dependencies between the variables. The dependencies are described by parameters of probability distributions. Each undirected sub-graph comprises nodes and undirected edges between the nodes in the neighborhood of the respective variable.
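The step of producing an undirected sub-graph around each variable can be pictured with a short Python sketch. It is a minimal illustration, not the patented implementation: the function name `local_subgraph`, the `radius` parameter (1 for direct neighbors, larger values for neighbors of higher degree) and the nine-node example graph are assumptions introduced here.

```python
from collections import deque


def local_subgraph(adjacency, target, radius=1):
    """Return the nodes and undirected edges within `radius` hops of `target`.

    `adjacency` maps each node to the set of its neighbours in the
    undirected graph. Edges are returned as frozensets {u, v}.
    """
    # Breadth-first search that stops expanding at the chosen radius.
    distance = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    nodes = set(distance)
    # Keep only edges whose two endpoints both lie in the neighbourhood.
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adjacency[u]
        if v in nodes
    }
    return nodes, edges


# Hypothetical undirected graph over nine variables numbered 1..9.
graph = {
    1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3, 5},
    5: {4, 6}, 6: {5, 7}, 7: {6, 8}, 8: {7, 9}, 9: {8},
}

nodes, edges = local_subgraph(graph, target=4, radius=1)
```

Because each sub-graph is built from the shared undirected graph alone, the sub-graphs for different variables can be processed independently of one another, as the abstract requires.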
Each directed sub-graph is learned so that it contains only nodes that are present as nodes in the corresponding undirected sub-graph, and only directed edges that are present as undirected edges in the corresponding undirected sub-graph. The structure of the undirected graph is learned with a test-based learning process, such as a statistical independence test, the PC algorithm and/or the three-phase dependency analysis (TPDA) algorithm. The test-based learning process is designed so that, for each variable, variables that are conditionally dependent on it and that satisfy a given heuristic function are added to a candidate set, and variables that are conditionally independent of it given some subset of the candidate set are removed from the candidate set. The heuristic function is defined so that the next variable added to the candidate set is the one that maximizes the smallest conditional dependence on the respective variable, tested over all possible subsets of the candidate set. After variables have been added and removed for the respective variable, directed edges are produced between that variable and the variables of the candidate set. A score-based learning process is used to learn the structure and parameters of each directed sub-graph. It searches for the respective directed sub-graph taking a score into account and uses a greedy algorithm for the search. For each variable, a local structure is defined within the undirected graph. The local structure comprises, as nodes, the respective variable, its neighbors and, if necessary, neighbors of higher degree, together with the undirected edges between them.
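The add-and-remove procedure with its max-min heuristic can be sketched in Python as follows. This is a hedged illustration under stated assumptions: the function names, the `assoc(x, target, cond)` callable (a stand-in for any conditional dependence measure, returning 0 for independence), the `threshold` parameter and the toy oracle are all hypothetical and are not taken from the patent.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items`, including the empty one."""
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def min_assoc(assoc, x, target, candidates):
    """Smallest dependence of x on target over all conditioning subsets."""
    return min(assoc(x, target, set(s)) for s in subsets(candidates))


def max_min_candidates(variables, target, assoc, threshold=0.0):
    """Grow and then shrink the candidate set for `target`."""
    candidates = []
    remaining = [v for v in variables if v != target]

    # Grow phase: add the variable whose smallest conditional
    # dependence on the target is largest, while it stays dependent.
    while remaining:
        scores = {x: min_assoc(assoc, x, target, candidates) for x in remaining}
        best = max(remaining, key=scores.get)
        if scores[best] <= threshold:
            break
        candidates.append(best)
        remaining.remove(best)

    # Shrink phase: drop variables that some subset of the other
    # candidates renders conditionally independent of the target.
    for x in list(candidates):
        others = [c for c in candidates if c != x]
        if min_assoc(assoc, x, target, others) <= threshold:
            candidates.remove(x)
    return candidates


def toy_assoc(x, target, cond):
    """Hypothetical dependence oracle; values chosen so both phases run."""
    if x == "A":  # A depends on T only through B
        return 0.0 if "B" in cond else 0.9
    return {"B": 0.8, "C": 0.5, "D": 0.0}[x]


result = max_min_candidates(["A", "B", "C", "D", "T"], "T", toy_assoc)
print(result)  # ['B', 'C']
```

In the toy run, `A` is added first, but the shrink phase removes it because conditioning on `B` makes it independent of `T`. Testing all subsets is exponential in the size of the candidate set, so a practical implementation would typically cap the size of the conditioning sets.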
This local structure constitutes the undirected sub-graph of the respective variable. After the respective directed sub-graph has been learned, the nodes that do not belong to the Markov blanket of the respective variable are removed from it. After these nodes have been removed, a feature partially directed graph is produced: for each occurring edge, probabilities are determined from the directed sub-graphs that the edge is directed in one direction or the other, that the edge is undirected and/or that no edge is actually present. A Bayesian network is learned. The data record comprises biological, medical and/or biomedical data, such as gene expression samples, occurrence of diseases, clinical data, lifestyle habits of patients and/or pre-existing conditions of patients. Alternatively, the data record comprises data from an automation system, a power generation system and/or a communication network. Independent claims are included for: (1) a method for computer-aided simulation of data based on a probabilistic network; and (2) a computer program product with program code stored on a machine-readable carrier.
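The Markov blanket pruning and the per-edge probabilities can be illustrated with a small Python sketch. The abstract does not say how the probabilities are computed, so the relative-frequency estimate below is an assumption, as are all function names and the example graphs. Since each learned sub-graph is fully directed here, only the outcomes "forward", "backward" and "absent" are counted.

```python
from collections import Counter


def markov_blanket(edges, target):
    """Parents, children and co-parents of `target` in a directed graph."""
    parents = {u for u, v in edges if v == target}
    children = {v for u, v in edges if u == target}
    spouses = {u for u, v in edges if v in children and u != target}
    return parents | children | spouses


def prune_to_blanket(edges, target):
    """Keep only edges among the target and its Markov blanket."""
    keep = markov_blanket(edges, target) | {target}
    return {(u, v) for u, v in edges if u in keep and v in keep}


def edge_feature_probabilities(subgraphs):
    """Estimate direction probabilities for every edge that occurs.

    `subgraphs` is a list of (nodes, directed_edges) pairs. For each node
    pair, only sub-graphs containing both nodes are counted (assumption).
    """
    pairs = {tuple(sorted(e)) for _, edges in subgraphs for e in edges}
    result = {}
    for a, b in pairs:
        counts = Counter()
        for nodes, edges in subgraphs:
            present = set(nodes) | {n for e in edges for n in e}
            if a not in present or b not in present:
                continue
            if (a, b) in edges:
                counts["forward"] += 1
            elif (b, a) in edges:
                counts["backward"] += 1
            else:
                counts["absent"] += 1
        total = sum(counts.values())
        result[(a, b)] = {
            k: counts[k] / total for k in ("forward", "backward", "absent")
        }
    return result


# Hypothetical directed sub-graph learned around variable "T".
learned = {("A", "B"), ("B", "T"), ("C", "T"), ("T", "E"), ("F", "E"), ("G", "F")}
pruned = prune_to_blanket(learned, "T")

# A second hypothetical sub-graph, learned around "E", that orients E-T differently.
g1 = ({"B", "C", "E", "F", "T"}, pruned)
g2 = ({"E", "F", "T"}, {("E", "T"), ("F", "E")})
probabilities = edge_feature_probabilities([g1, g2])
```

In this example the two sub-graphs disagree on the direction of the edge between `E` and `T`, so each direction receives probability 0.5, while the edge from `F` to `E` is oriented the same way in both.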

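Bibliographic data

  • Publication number: DE102007044380A1
  • Publication date: 2009-03-19
  • Original format: PDF
  • Applicant/assignee: SIEMENS AG
  • Application number: DE20071044380
  • Filing date: 2007-09-17
  • Classification: G06N7/00; C12Q1/00
  • Country: DE
  • Date added to database: 2022-08-21 19:09:33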
