Full-Text Access to Historical Newspapers

Abstract

Newspapers are rich records of U.S. history. Because older newspapers are deteriorating, the National Endowment for the Humanities is archiving 19th-century newspapers on microfilm. Microfilm is a good preservation method, but it gives researchers and the general public only limited access. We are building a system to provide universal access to digital images and full-text content of historical newspapers. The system has three main components: (a) an Optical Character Recognition (OCR) module that converts digitized images into searchable text and identifies regions, (b) an Information Retrieval module that applies linguistic information to aid in segmentation, indexing, and retrieval of the noisy OCR output, and (c) a User Interface module that lets historians and educators query and view retrieved documents. So far, we have developed two OCR techniques targeted at historical newspapers, and we have built a user interface that searches the OCR output and superimposes matches on the corresponding newspaper page image.
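
The retrieval and display steps described above can be illustrated with a minimal sketch. It is not the report's implementation: it assumes the OCR module emits one record per word with a bounding box, and it substitutes a simple character-similarity score from Python's standard difflib module for the linguistic techniques the abstract mentions. The names OcrWord, normalize, and find_matches, and the 0.8 threshold, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class OcrWord:
    """One recognized word plus its bounding box on the page image (assumed layout)."""
    text: str    # word as produced by OCR; may contain recognition errors
    x: int       # left edge of the box, in pixels
    y: int       # top edge of the box, in pixels
    width: int
    height: int


def normalize(token: str) -> str:
    """Lowercase the token and drop punctuation so 'governor,' equals 'governor'."""
    return "".join(ch for ch in token.lower() if ch.isalnum())


def find_matches(words, query, threshold=0.8):
    """Return the OCR words whose text is approximately equal to the query.

    The similarity score tolerates a small number of OCR character errors.
    The threshold is an illustrative assumption, not a tuned value.
    """
    target = normalize(query)
    hits = []
    for word in words:
        candidate = normalize(word.text)
        if not candidate:
            continue  # skip tokens that were pure punctuation
        score = SequenceMatcher(None, target, candidate).ratio()
        if score >= threshold:
            hits.append(word)
    return hits


# Hypothetical OCR output for part of a page; 'Gov3rnor' simulates an OCR error.
page = [
    OcrWord("The", 10, 10, 30, 12),
    OcrWord("Gov3rnor", 45, 10, 80, 12),
    OcrWord("spoke", 130, 10, 50, 12),
    OcrWord("governor,", 10, 30, 85, 12),
]

# Boxes a user interface could draw on top of the page image.
boxes = [(w.x, w.y, w.width, w.height) for w in find_matches(page, "governor")]
```

Here the query "governor" matches both the cleanly recognized word and the misrecognized "Gov3rnor", and the returned boxes are what an interface would superimpose on the page image. A production system would index the text rather than scan every word for each query.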
