Scatteract: Automated Extraction of Data from Scatter Plots

Abstract

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
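
The pixel-to-coordinate step can be illustrated with a small sketch. The abstract does not say which robust regression the system uses, so the Theil-Sen estimator (median of pairwise slopes) stands in here purely as an assumption. The function names and tick data are hypothetical and are not taken from the paper. The idea is that each axis tick gives a pixel position and an OCR-read label, and a robust linear fit tolerates an occasional misread label.

```python
from itertools import combinations
from statistics import median


def fit_axis_mapping(pixels, values):
    """Fit value = slope * pixel + intercept with the Theil-Sen estimator.

    Theil-Sen is an illustrative choice of robust regression, not
    necessarily the method used by the system described above.
    """
    pairs = list(zip(pixels, values))
    # Slope between every pair of ticks at distinct pixel positions.
    slopes = [
        (v2 - v1) / (p2 - p1)
        for (p1, v1), (p2, v2) in combinations(pairs, 2)
        if p2 != p1
    ]
    if not slopes:
        raise ValueError("need at least two ticks at distinct pixel positions")
    # Taking medians keeps a minority of bad OCR readings from skewing the fit.
    slope = median(slopes)
    intercept = median(v - slope * p for p, v in pairs)
    return slope, intercept


# Hypothetical x-axis ticks: pixel columns and their OCR-read labels.
# The label at pixel 300 is deliberately wrong (80 instead of 30) to mimic
# an OCR error on a linear axis.
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [10.0, 20.0, 80.0, 40.0, 50.0]

slope, intercept = fit_axis_mapping(tick_pixels, tick_values)


def pixel_to_value(pixel):
    """Map a pixel column to the chart's x coordinate using the fitted line."""
    return slope * pixel + intercept


print(pixel_to_value(250))
```

With these made-up ticks the fit recovers roughly 0.1 chart units per pixel despite the misread label, so a detected point at pixel column 250 maps to about 25. The same function would apply to the y axis, where the slope is typically negative because image rows increase downward.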