Venue: International Conference on Frontiers in Handwriting Recognition

A Bayesian Approach to Script Independent Multilingual Keyword Spotting

Abstract

We propose a script-independent Bayesian framework for keyword spotting in multilingual handwritten documents. The approach relies on local character-level scores and global word-level hypothesis scores, and learns a Bayesian logistic regression classifier to distinguish keywords from non-keywords. In the Bayesian formulation of logistic regression, the integral over the weights is intractable, so a variational approximation is used for inference. To learn a robust classifier from a minimal number of samples, we apply a Bayesian active learning framework that requests labels for the word images that provide the maximum information gain in improving the classifier. We evaluate our system on three datasets: the publicly available IAM dataset for English, the AMA dataset for Arabic, and the LAW dataset for Devanagari. The system is also evaluated on a synthetic multilingual dataset prepared by combining samples from the IAM, AMA and LAW datasets. The results are comparable with the state-of-the-art multilingual keyword spotting framework.
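
The pipeline outlined in the abstract (Bayesian logistic regression, variational inference, and label requests driven by information gain) can be illustrated with a minimal sketch. The abstract does not say which variational approximation or which information-gain estimator is used, so the sketch assumes the Jaakkola-Jordan bound with a zero-mean Gaussian prior and a Monte Carlo estimate of the mutual information between a label and the weights. The function names, the prior variance, and the synthetic feature vectors (stand-ins for the character-level and word-level scores) are all hypothetical.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def fit_variational_blr(X, t, prior_var=10.0, n_iter=50):
    """Gaussian posterior N(m, S) over the weights of a logistic regression.

    Assumption: Jaakkola-Jordan variational bound, prior N(0, prior_var * I).
    X is (n, d); t holds 0/1 labels (1 = keyword, 0 = non-keyword).
    """
    n, d = X.shape
    prior_precision = np.eye(d) / prior_var
    xi = np.ones(n)  # one variational parameter per training sample
    for _ in range(n_iter):
        xi = np.maximum(xi, 1e-8)  # guard against division by zero
        lam = np.tanh(xi / 2.0) / (4.0 * xi)
        S = np.linalg.inv(prior_precision + 2.0 * (X.T * lam) @ X)
        S = 0.5 * (S + S.T)  # remove numerical asymmetry
        m = S @ (X.T @ (t - 0.5))
        # Re-estimate xi from the current posterior: xi_n^2 = x_n^T (S + m m^T) x_n
        xi = np.sqrt(np.einsum("ij,jk,ik->i", X, S + np.outer(m, m), X))
    return m, S


def binary_entropy(p):
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def information_gain(X_pool, m, S, n_samples=2000, seed=0):
    """Monte Carlo estimate of the mutual information between each unlabeled
    sample's label and the weights: H[mean prediction] - mean[H[prediction]]."""
    rng = np.random.default_rng(seed)
    W = rng.multivariate_normal(m, S, size=n_samples)  # (n_samples, d)
    P = sigmoid(X_pool @ W.T)                          # (n_pool, n_samples)
    return binary_entropy(P.mean(axis=1)) - binary_entropy(P).mean(axis=1)


# Hypothetical usage on synthetic score vectors (not data from the paper).
rng = np.random.default_rng(1)
X_labeled = np.hstack([rng.normal(size=(40, 2)), np.ones((40, 1))])  # last column = bias
t_labeled = (X_labeled @ np.array([2.0, -1.0, 0.5]) > 0).astype(float)
X_pool = np.hstack([rng.normal(size=(100, 2)), np.ones((100, 1))])

m, S = fit_variational_blr(X_labeled, t_labeled)
gain = information_gain(X_pool, m, S)
next_query = int(np.argmax(gain))  # index of the word image to send for labeling
```

In an active learning loop, the sample at `next_query` would be labeled, moved from the pool to the labeled set, and the posterior refitted; this repeats until the labeling budget is exhausted.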