Information visualization for document classification

Abstract

This project seeks to combine state-of-the-art information visualization techniques with the Cannon Quality Factors for text images in order to characterize and discriminate among text documents and their digital images. It will provide an effective tool for characterizing and managing a test corpus of more than 1,200 documents. The basic concept is that, once the corpus has been characterized, it should be possible to use the Cannon Quality Factors to visually identify regions of expected OCR accuracy and degree of OCR difficulty within the OCR Test Corpus. We have been working with an information visualization tool (dubbed "Parentage") to identify the appropriate metric data for these purposes. Two important potential applications of this work are (1) identifying new research directions for OCR development and (2) identifying the most appropriate commercial OCR engine or system to use with a given set of documents.

Bibliographic details

Source
1999 National Conference of the American Society for Engineering Management, October 21-23, 1999, Virginia Beach, VA | 1999 | pp. 57-60 | 4 pages
Conference location: Virginia Beach, VA (US)
Authors
Theresa Jefferson; Richard Scotti
Original format: PDF
Language: English
Chinese Library Classification: General industrial technology
Date added to database: 2022-08-26 14:11:04
