Journal: Pattern Recognition Letters

An annotation assistance system using an unsupervised codebook composed of handwritten graphical multi-stroke symbols


Abstract

Many current recognition systems rely on ground-truthed datasets for training, evaluation, and testing, but creating such datasets is tedious. This paper proposes an iterative, unsupervised framework for learning handwritten graphical symbols that can assist with this labeling task. Each stroke is initialized as a segment, and a relational graph is built in which the nodes are the segments and the edges are the spatial relations between them. To extract the relevant patterns, both the segments and the spatial relations are quantized. Discovering graphical symbols then becomes the problem of finding sub-graphs according to the Minimum Description Length (MDL) principle. The discovered graphical symbols become the new segments for the next iteration. In each iteration, the quantization of segments yields the codebook in which the user can label graphical symbols. This original method was first applied to a dataset of simple mathematical expressions. The results reported in this work show that only 58.2% of the strokes have to be labeled manually.
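
One pass of the pipeline described in the abstract (quantize segments, build the relational graph, pick a sub-graph by description length) might be sketched as follows. This is a toy illustration only, not the paper's implementation. The abstract does not give the stroke features, the quantization method, or the MDL encoding, so every choice here is an assumption: bounding-box (width, height) features, leader clustering, four direction classes, two-segment patterns only, and a crude bit-count formula. All function names are hypothetical.

```python
import math

def quantize(features, radius):
    """Leader clustering (assumed stand-in for the paper's quantizer): reuse the
    first codeword within `radius`, otherwise create a new codeword."""
    codebook, labels = [], []
    for f in features:
        for k, c in enumerate(codebook):
            if math.dist(f, c) <= radius:
                labels.append(k)
                break
        else:
            codebook.append(f)
            labels.append(len(codebook) - 1)
    return codebook, labels

def relation(a, b):
    """Quantize the direction from centre a to centre b (y grows downward)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if abs(dx) >= abs(dy):
        return "right" if dx >= 0 else "left"
    return "below" if dy >= 0 else "above"

def build_graph(centres, max_dist):
    """Edges (i, j, relation) between segments whose centres are close, i < j."""
    return [(i, j, relation(centres[i], centres[j]))
            for i in range(len(centres))
            for j in range(i + 1, len(centres))
            if math.dist(centres[i], centres[j]) <= max_dist]

def description_length(n_nodes, n_edges, n_node_labels, n_edge_labels):
    """Crude bit count (an assumption): one label per node, plus one label and
    two endpoint indices per edge."""
    node_bits = n_nodes * math.log2(max(n_node_labels, 2))
    edge_bits = n_edges * (math.log2(max(n_edge_labels, 2))
                           + 2 * math.log2(max(n_nodes, 2)))
    return node_bits + edge_bits

def best_pattern(labels, edges):
    """Pick the two-segment pattern whose replacement by a single new node
    gives the shortest total description (pattern + compressed graph)."""
    n_labels = len(set(labels))
    n_rels = len({r for _, _, r in edges})
    best, best_dl = None, description_length(len(labels), len(edges), n_labels, n_rels)
    instances, used = {}, {}
    for i, j, r in edges:
        key = (labels[i], r, labels[j])
        taken = used.setdefault(key, set())
        if i not in taken and j not in taken:  # keep instances non-overlapping
            taken.update((i, j))
            instances.setdefault(key, []).append((i, j))
    for key, inst in instances.items():
        k = len(inst)
        # Each instance merges two nodes into one and drops its internal edge.
        total = (description_length(2, 1, n_labels, n_rels)
                 + description_length(len(labels) - k, len(edges) - k,
                                      n_labels + 1, n_rels))
        if total < best_dl:
            best, best_dl = key, total
    return best, instances.get(best, [])

# Toy expression "1 = 1 = 1 =": vertical strokes and pairs of horizontal bars.
features = [(0.1, 1.0), (1.0, 0.1), (1.0, 0.1)] * 3
centres = [(0, 0), (2, -0.2), (2, 0.2), (4, 0), (6, -0.2), (6, 0.2),
           (8, 0), (10, -0.2), (10, 0.2)]
codebook, labels = quantize(features, radius=0.3)
edges = build_graph(centres, max_dist=1.0)
pattern, instances = best_pattern(labels, edges)
print(pattern, instances)
```

On this toy input the selected pattern is the pair of horizontal bars stacked vertically, which is the "=" sign. In the iterative scheme the abstract describes, each instance would then be merged into a new segment and the procedure repeated, with the user labeling codebook entries rather than individual strokes.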
