Journal of Visual Languages & Computing

Chart decoder: Generating textual and numeric information from chart images automatically


Abstract

Charts are commonly used to visualize numerical data in digital documents. For many legacy or scientific charts, however, the underlying data is not available, which hinders both redesigning them as more effective visualizations and analyzing them further. In response, we present Chart Decoder, a system that decodes visual features and recovers data from chart images. Chart Decoder takes a chart image as input and outputs the textual and numeric information of that chart by applying deep learning, computer vision and text recognition techniques. We train a deep learning based classifier to identify five chart types (bar chart, pie chart, line chart, scatter plot and radar chart); it achieves a classification accuracy of over 99%. We complement the classifier with a textual information extraction pipeline that detects text regions in a chart, recognizes the text content and distinguishes the role of each text element. To generate textual and graphical information, we implement automated data recovery for bar charts, one of the most popular chart types. We evaluate the system on two corpora: 1) bar charts collected from the web, and 2) charts randomly generated by a script. The results demonstrate that our system is able to recover data from bar charts with a high rate of accuracy.
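The abstract does not say how numeric values are recovered from a bar chart. One common approach, shown here only as an illustrative sketch and not as the authors' method, is to calibrate the value axis from recognized tick labels and then map each bar's top edge from pixel coordinates to data values. The function names `fit_axis` and `recover_bar_values`, and the assumption that tick positions and bar tops have already been detected, are hypothetical.

```python
def fit_axis(ticks):
    """Fit value = a * pixel_y + b by least squares.

    `ticks` is a list of (pixel_y, value) pairs, assumed to come from
    axis tick labels that were already detected and recognized.
    """
    n = len(ticks)
    if n < 2:
        raise ValueError("need at least two axis ticks to calibrate")
    mean_y = sum(p for p, _ in ticks) / n
    mean_v = sum(v for _, v in ticks) / n
    var_y = sum((p - mean_y) ** 2 for p, _ in ticks)
    if var_y == 0:
        raise ValueError("ticks must lie at distinct pixel positions")
    cov = sum((p - mean_y) * (v - mean_v) for p, v in ticks)
    a = cov / var_y
    b = mean_v - a * mean_y
    return a, b


def recover_bar_values(bar_tops, ticks):
    """Map the pixel y-coordinate of each bar's top edge to a data value.

    Assumes a linear (non-logarithmic) vertical value axis.
    """
    a, b = fit_axis(ticks)
    return [a * y + b for y in bar_tops]


# Hypothetical input: image y grows downward, so value 0 sits at the
# largest pixel coordinate.
ticks = [(400, 0.0), (300, 50.0), (200, 100.0)]
values = recover_bar_values([350, 250, 400], ticks)
```

Using more than two ticks and a least-squares fit tolerates small localization errors in individual tick labels; a logarithmic axis would need the values transformed before fitting.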
