ACM Conference on Human Factors in Computing Systems

Interactive Repair of Tables Extracted from PDF Documents on Mobile Devices

Abstract

PDF documents often contain rich data tables that offer opportunities for dynamic reuse in new interactive applications. We describe a pipeline for extracting, analyzing, and parsing PDF tables based on existing machine learning and rule-based techniques. Implementing and deploying this pipeline on a corpus of 447 documents with 1,171 tables results in only 11 tables that are correctly extracted and parsed. To improve the results of automatic table analysis, we first present a taxonomy of errors that arise in the analysis pipeline and discuss the implications of cascading errors on the user experience. We then contribute a system with two sets of lightweight interaction techniques (gesture and toolbar), for viewing and repairing extraction errors in PDF tables on mobile devices. In an evaluation with 17 users involving both a phone and a tablet, participants effectively repaired common errors in 10 tables, with an average time of about 2 minutes per table.