
Dictionary Production for Census Form Conference

Abstract

Dictionaries can be produced from two categories of data: old data from a previous collection, and new data from a current collection. Dictionaries built from old data can supply examples of possible answers, assist optical character recognition (OCR) systems, and train recognition systems. Dictionaries built from new data are most useful for testing and scoring system results. For each category there are two types of dictionary, both of which may be useful for work with the Second Census OCR Conference. The first, the essential dictionary, contains all words that have occurred in the data set being used. The second, the exploratory dictionary, can be built from the essential dictionary: its misspellings are corrected, its abbreviations expanded, and all its words stemmed into logical minimal stems. A mapping from the essential dictionary to the exploratory dictionary is required.
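
The relationship between the two dictionary types can be illustrated with a minimal sketch. Nothing below comes from the report itself: the function names, the lookup tables for misspellings and abbreviations, the lowercasing step, and the toy suffix-stripping stemmer are all hypothetical choices made for illustration. The sketch counts every word in a set of responses to form an essential dictionary, then derives an exploratory dictionary together with the mapping between the two.

```python
from collections import Counter

# Hypothetical lookup tables (assumption): a real project would derive
# these from the collected data rather than hard-code them.
MISSPELLINGS = {"enginer": "engineer", "techer": "teacher"}
ABBREVIATIONS = {"mgr": "manager", "asst": "assistant"}
# Toy suffix list (assumption): a stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ers", "er", "s")


def build_essential(responses):
    """Count every word that occurs in the data set.

    Lowercasing is an assumption made here to merge case variants.
    """
    counts = Counter()
    for response in responses:
        counts.update(response.lower().split())
    return counts


def normalize(word):
    """Correct a misspelling, expand an abbreviation, then strip one suffix."""
    word = MISSPELLINGS.get(word, word)
    word = ABBREVIATIONS.get(word, word)
    for suffix in SUFFIXES:
        # Keep at least four characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def build_exploratory(essential):
    """Return the exploratory dictionary and the essential-to-exploratory mapping."""
    mapping = {word: normalize(word) for word in essential}
    exploratory = Counter()
    for word, count in essential.items():
        exploratory[mapping[word]] += count
    return exploratory, mapping


responses = ["Asst Mgr", "teacher", "techer", "teachers", "farming"]
essential = build_essential(responses)
exploratory, mapping = build_exploratory(essential)
```

Keeping the mapping as an explicit dictionary means any entry in the exploratory dictionary can be traced back to the original spellings that produced it, which is the role the abstract assigns to the mapping.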
