Combining Visual and Textual Features for Information Extraction from Online Flyers


Abstract

Information in visually rich formats such as PDF and HTML is often conveyed by a combination of textual and visual features. In particular, genres such as marketing flyers and infographics often augment textual information through color, size, positioning, etc. As a result, traditional text-based approaches to information extraction (IE) may underperform. In this study, we present a supervised machine learning approach to IE from online commercial real estate flyers. We evaluated the performance of SVM classifiers on the task of identifying 12 types of named entities using a combination of textual and visual features. Results show that the addition of visual features such as color, size, and positioning significantly increased classifier performance.
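The idea of combining textual and visual features can be illustrated with a minimal sketch. It is not the authors' implementation: the `TextSpan` class, the individual feature names, and the units are all assumptions made for illustration, and the abstract does not specify the actual feature set. The sketch only shows how color, size, and positioning might be placed alongside surface text features in one feature dictionary, which could then be vectorized and passed to a classifier such as an SVM.

```python
from dataclasses import dataclass
import re


@dataclass
class TextSpan:
    """A piece of flyer text plus layout attributes (hypothetical schema)."""
    text: str
    font_size: float  # points (assumed unit)
    color: str        # hex RGB such as "#cc0000" (assumed encoding)
    x: float          # left offset as a fraction of page width, 0..1
    y: float          # top offset as a fraction of page height, 0..1


def textual_features(span: TextSpan) -> dict:
    """Surface features computed from the text alone."""
    return {
        "n_tokens": float(len(span.text.split())),
        "has_digit": float(bool(re.search(r"\d", span.text))),
        "all_caps": float(span.text.isupper()),
        "has_currency": float("$" in span.text),
    }


def visual_features(span: TextSpan, median_font_size: float) -> dict:
    """Color, size, and positioning features."""
    # Split "#rrggbb" into three channels scaled to 0..1.
    r, g, b = (int(span.color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    return {
        # Size relative to the page's typical text, so headings stand out.
        "rel_font_size": span.font_size / median_font_size,
        "red": r,
        "green": g,
        "blue": b,
        "x": span.x,
        "y": span.y,
    }


def combined_features(span: TextSpan, median_font_size: float) -> dict:
    """Merge both groups; prefixes keep the two feature families separate."""
    feats = {f"txt:{k}": v for k, v in textual_features(span).items()}
    feats.update(
        {f"vis:{k}": v for k, v in visual_features(span, median_font_size).items()}
    )
    return feats


# Example: a large red headline near the top of a flyer (made-up values).
span = TextSpan("FOR LEASE 12,000 SF", font_size=24.0, color="#cc0000", x=0.1, y=0.05)
features = combined_features(span, median_font_size=12.0)
```

Dropping the `vis:` entries from the dictionary gives a text-only baseline, which is the kind of comparison the abstract describes.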


Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing, 2014, 6 pages
Authors: Emilia Apostolova; Noriko Tomuro
Original format: PDF
Chinese Library Classification: Programming, software engineering

