Interactive Repair of Tables Extracted from PDF Documents on Mobile Devices


Abstract

PDF documents often contain rich data tables that offer opportunities for dynamic reuse in new interactive applications. We describe a pipeline for extracting, analyzing, and parsing PDF tables based on existing machine learning and rule-based techniques. Implementing and deploying this pipeline on a corpus of 447 documents with 1,171 tables results in only 11 tables that are correctly extracted and parsed. To improve the results of automatic table analysis, we first present a taxonomy of errors that arise in the analysis pipeline and discuss the implications of cascading errors on the user experience. We then contribute a system with two sets of lightweight interaction techniques (gesture and toolbar), for viewing and repairing extraction errors in PDF tables on mobile devices. In an evaluation with 17 users involving both a phone and a tablet, participants effectively repaired common errors in 10 tables, with an average time of about 2 minutes per table.


Bibliographic details

Source: ACM Conference on Human Factors in Computing Systems | 2019 | 1 (CD-ROM) | 13 pages
Authors: Jane Hoffswell; Zhicheng Liu
Original format: PDF
Chinese Library Classification: TP30-53
Keywords: PDF; Data tables; Table classification; Error taxonomy; Error correction; Mobile devices; Interaction techniques
Date indexed: 2022-08-21 09:35:46

Similar literature

1. TEXUS: A unified framework for extracting and understanding tables in PDF documents [J]. Rastan Roya, Paik Hye-Young, Shepherd John. Information Processing & Management. 2019, Issue 3
2. Intelligent document-filling system on mobile devices by document classification and electronization [J]. Lu Jing, Wu Shihong, Xiang Zhiyu. Computational Intelligence. 2020, Issue 4
3. On methods and tools of table detection, extraction and annotation in PDF documents [J]. Shah Khusro, Asima Latif, Irfan Ullah. Journal of Information Science. 2015, Issue 1
4. Interactive Repair of Tables Extracted from PDF Documents on Mobile Devices [C]. Jane Hoffswell, Zhicheng Liu. ACM Conference on Human Factors in Computing Systems. 2019
5. Interactive In-Situ Scene Capture on Mobile Devices [D]. Sankar, Aditya. 2017
6. Embedding and Publishing Interactive 3-Dimensional Scientific Figures in Portable Document Format (PDF) Files [O]. David G. Barnes, Michail Vidiassov, Bernhard Ruthensteiner
7. PDF-TREX: An Approach for Recognizing and Extracting Tables from PDF Documents [O]. Ermelinda Oro, Massimo Ruffolo. 2009
