Conference: IAPR International Conference on Document Analysis and Recognition

Compact and Efficient WFST-Based Decoders for Handwriting Recognition

Abstract

We present two weighted finite-state transducer (WFST) based decoders for handwriting recognition. One is a cloud-based solution that is both compact and efficient; the other is a device-based solution with a small memory footprint. A compact WFST data structure, which stores no output labels on its transitions, is proposed for the cloud-based decoder. A decoder built on this compact data structure produces the same results as a decoder built on the corresponding standard WFST, with a significantly smaller footprint. For the device-based decoder, on-the-fly language model rescoring is performed to reduce the footprint. Careful engineering methods, such as WFST weight quantization and token and data type refinement, are also explored. When using a language model containing 600,000 n-grams, the cloud-based decoder achieves an average decoding time of 4.04 ms per text line with a peak footprint of 114.4 MB, while the device-based decoder achieves an average decoding time of 13.47 ms per text line with a peak footprint of 31.6 MB.
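The abstract names WFST weight quantization as one of the footprint reductions but does not describe the scheme. As a rough illustration only, and not the authors' method, the sketch below assumes a uniform 8-bit quantizer over a list of transition weights. The function names `quantize_weights` and `dequantize_weights` and the sample weights are hypothetical.

```python
from array import array


def quantize_weights(weights):
    """Map float weights to 8-bit codes on a uniform grid (assumed scheme)."""
    lo, hi = min(weights), max(weights)
    # Guard against a zero range when all weights are equal.
    step = (hi - lo) / 255 if hi > lo else 1.0
    # One unsigned byte per weight instead of a 4- or 8-byte float.
    codes = array("B", (min(255, round((w - lo) / step)) for w in weights))
    return codes, lo, step


def dequantize_weights(codes, lo, step):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * step for c in codes]


# Hypothetical transition weights (e.g. negative log probabilities).
weights = [0.0, 0.25, 1.5, 3.75, 10.0]
codes, lo, step = quantize_weights(weights)
restored = dequantize_weights(codes, lo, step)

# The rounding error of a uniform grid is at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing one byte per weight plus a shared offset and step trades a bounded rounding error for a smaller table. Whether the paper uses a uniform grid, and at what bit width, is not stated in the abstract.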
