EFFICIENT GLOBALLY OPTIMAL INTERPRETATION OF DOCUMENTS

Abstract

A method is provided for parsing a document having a plurality of lines on which items are listed, each item spanning one or more lines. The method includes: obtaining a plurality of candidates representing hypothetical items within the document, each candidate spanning one or more lines and having a local cost representing a confidence in the quality of the candidate compared to a model; determining labeling costs for intervals of the document defined between pairs of lines, each interval containing candidates, and each labeling cost reflecting a configuration of the candidates within the interval; identifying a best labeling for each interval based on the labeling costs determined for that interval, the best labeling corresponding to one of the configurations of the candidates within the interval; defining a global objective function; and selecting a subset of the candidates, based on the identified best labelings, such that the global objective function is optimized.
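The abstract does not specify the objective function or how interval labelings are combined, so the following is only a minimal sketch of one way the selection step could work, not the patented method itself. It assumes that selected candidates may not overlap, that the global objective is the sum of the local costs of the selected candidates plus a fixed penalty for every line left unexplained, and that the best labeling of each prefix interval can be built by dynamic programming. The names `Candidate`, `select_candidates`, and `skip_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    start: int   # first line covered (0-based, inclusive)
    end: int     # last line covered (0-based, inclusive)
    cost: float  # local cost; lower means a better fit to the model


def select_candidates(num_lines, candidates, skip_cost=1.0):
    """Return (total_cost, chosen) minimizing the assumed global objective.

    Assumed objective: sum of local costs of the chosen, non-overlapping
    candidates plus skip_cost for every line no chosen candidate covers.
    """
    # Group candidates by the line boundary just past their last line.
    ending_at = {}
    for cand in candidates:
        if not (0 <= cand.start <= cand.end < num_lines):
            raise ValueError(f"candidate out of range: {cand}")
        ending_at.setdefault(cand.end + 1, []).append(cand)

    # best[j] is the minimal cost of labeling the interval of lines 0..j-1.
    best = [0.0] * (num_lines + 1)
    # choice[j] is the candidate ending at boundary j in that best labeling,
    # or None if line j-1 is left unexplained.
    choice = [None] * (num_lines + 1)
    for j in range(1, num_lines + 1):
        best[j] = best[j - 1] + skip_cost
        for cand in ending_at.get(j, []):
            total = best[cand.start] + cand.cost
            if total < best[j]:
                best[j] = total
                choice[j] = cand

    # Walk back from the last boundary to recover the chosen subset.
    chosen = []
    j = num_lines
    while j > 0:
        cand = choice[j]
        if cand is None:
            j -= 1
        else:
            chosen.append(cand)
            j = cand.start
    chosen.reverse()
    return best[num_lines], chosen


if __name__ == "__main__":
    # Illustrative data only: four overlapping hypotheses on a 5-line document.
    demo = [
        Candidate(0, 1, 0.5),
        Candidate(1, 2, 0.2),
        Candidate(2, 4, 1.0),
        Candidate(3, 4, 0.4),
    ]
    total, picked = select_candidates(5, demo, skip_cost=1.0)
    print(total, picked)
```

Under these assumptions the table is filled in one left-to-right pass, so the running time grows linearly with the number of lines plus the number of candidates, and the returned subset is globally optimal for the assumed objective rather than a greedy, line-by-line choice.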