Published in: International Conference on Document Analysis and Recognition

Bayesian Network Structure Learning and Inference Methods for Handwriting

Abstract

Probabilistic models of the characteristics of handwritten words are useful in forensic document examination because they can answer queries such as: how rare is a given style of writing a word, and what is the probability of observing those characteristics in a representative database of a given size. The task considered here is to use a training set of samples of a word written by a representative population of individuals, with each individual's writing of the word described by a fixed set of discrete categorical variables, to construct directed probabilistic graphical models (Bayesian networks, or BNs) and then to use those models to answer probabilistic queries. Because the BN structure learning problem is NP-hard, we propose an approximate method and analyze its performance and complexity. The proposed algorithm uses a local measure of deviance from independence (chi-squared tests between pairs of variables) and a global score (log-loss). The method builds the BN structure incrementally by adding directed edges with high deviance and choosing each edge direction to minimize log-loss. The method is evaluated with samples of the word "and" obtained from a representative population of the United States, with descriptive characteristic sets that differ for cursive writing and for hand-printing. For several samples obtained from the BN, the probability of random correspondence (PRC) is inferred. A measure of the discriminatory power of the characteristic set (conditional PRC) is also determined. The computational complexity of determining the probability of finding a sample similar to a given sample, within a tolerance, in a database of given size is discussed.
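
The incremental procedure described in the abstract (rank variable pairs by chi-squared deviance, then orient each added edge to minimize log-loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`chi2_stat`, `log_loss`, `creates_cycle`, `learn_structure`), the Laplace smoothing parameter `alpha`, and the cutoff `min_stat` are all assumptions introduced here for illustration. The default cutoff of 3.84 is the 5% critical value for one degree of freedom, so it is only appropriate for binary variables.

```python
import math
from collections import Counter
from itertools import combinations

def chi2_stat(data, i, j):
    """Pearson chi-squared statistic for independence of columns i and j."""
    n = len(data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    cij = Counter((row[i], row[j]) for row in data)
    return sum((cij[(a, b)] - na * nb / n) ** 2 / (na * nb / n)
               for a, na in ci.items() for b, nb in cj.items())

def log_loss(data, parents, alpha=1.0):
    """Mean negative log-likelihood of data under the BN given by `parents`.

    Conditional probabilities are fitted on the same data with Laplace
    smoothing (alpha is an assumed hyperparameter, not from the paper).
    """
    total = 0.0
    for v in range(len(data[0])):
        arity = len({row[v] for row in data})
        pa = sorted(parents[v])
        joint = Counter((tuple(row[p] for p in pa), row[v]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        for (cfg, _), count in joint.items():
            p = (count + alpha) / (marg[cfg] + alpha * arity)
            total -= count * math.log(p)
    return total / len(data)

def creates_cycle(parents, src, dst):
    """True if adding src -> dst closes a cycle (dst is an ancestor of src)."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def learn_structure(data, min_stat=3.84):
    """Greedily add edges in order of decreasing chi-squared deviance."""
    n_vars = len(data[0])
    parents = {v: set() for v in range(n_vars)}
    stats = {p: chi2_stat(data, *p) for p in combinations(range(n_vars), 2)}
    for i, j in sorted(stats, key=stats.get, reverse=True):
        if stats[(i, j)] < min_stat:
            break  # remaining pairs show too little deviance from independence
        best = None
        for src, dst in ((i, j), (j, i)):
            if creates_cycle(parents, src, dst):
                continue
            parents[dst].add(src)          # tentatively add the edge
            loss = log_loss(data, parents)
            parents[dst].remove(src)
            if best is None or loss < best[0]:
                best = (loss, src, dst)
        if best is not None:
            parents[best[2]].add(best[1])  # keep the lower-loss direction
    return parents

# Toy data: columns 0 and 1 always agree, column 2 is independent of both.
toy = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
print(learn_structure(toy))
```

On the toy data, only the pair (0, 1) exceeds the cutoff, so a single edge is added between those two variables. With a lone edge and no other parents the two orientations give essentially the same log-loss, so the chosen direction is arbitrary here; orientation only becomes informative once variables already have other parents.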
