Conference: Advanced Information Management and Service (ICIPM), 2011 7th International Conference on

Automatic metadata extraction and classification of spreadsheet documents based on layout similarity

Abstract

Effective information search is becoming a key success factor for business. Metadata is an essential part of modern information systems because it helps people find relevant documents in disparate repositories. Automatic document metadata extraction has received attention in recent years, as it is an important step in generating powerful search indices that support effective information search. This paper proposes an innovative method that automatically performs metadata extraction and classification on spreadsheets whose layout is similar to that of a given sample spreadsheet whose metadata has been defined in advance. Metadata classification is based on document types (e.g. purchase order, sales report) and data context (e.g. customer name, order date), so that users can define the meanings of the keywords in a search query. As a result, the search engine developed in this work returns results that match the user's search intention more closely than those of conventional keyword search engines.
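
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a spreadsheet can be represented as a sparse grid of cell strings, that a sample spreadsheet is described by the positions of its caption cells plus the positions of its data fields, and that layout similarity is simply the fraction of caption cells found in the same place. The names Template, layout_similarity, extract_metadata and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]        # (row, column), zero-based
Sheet = Dict[Cell, str]       # sparse grid: only non-empty cells are stored


@dataclass
class Template:
    """Hypothetical description of a sample spreadsheet with known metadata."""
    doc_type: str             # document type, e.g. "purchase order"
    labels: Dict[Cell, str]   # caption cells that characterise the layout
    fields: Dict[str, Cell]   # data context -> cell that holds the value


def layout_similarity(sheet: Sheet, template: Template) -> float:
    """Fraction of template caption cells found at the same position (assumed measure)."""
    if not template.labels:
        return 0.0
    hits = sum(
        1
        for cell, label in template.labels.items()
        if sheet.get(cell, "").strip().lower() == label.strip().lower()
    )
    return hits / len(template.labels)


def extract_metadata(
    sheet: Sheet, templates: List[Template], threshold: float = 0.8
) -> Optional[Dict[str, str]]:
    """Classify the sheet by its best-matching template and read the field cells."""
    if not templates:
        return None
    best = max(templates, key=lambda t: layout_similarity(sheet, t))
    if layout_similarity(sheet, best) < threshold:
        return None  # no sample layout is close enough
    metadata = {"document_type": best.doc_type}
    for name, cell in best.fields.items():
        metadata[name] = sheet.get(cell, "")
    return metadata


# Usage with made-up data
po_template = Template(
    doc_type="purchase order",
    labels={(0, 0): "Purchase Order", (2, 0): "Customer", (3, 0): "Order date"},
    fields={"customer name": (2, 1), "order date": (3, 1)},
)
sheet = {
    (0, 0): "PURCHASE ORDER",
    (2, 0): "Customer", (2, 1): "Acme Ltd",
    (3, 0): "Order date", (3, 1): "2011-11-29",
}
print(extract_metadata(sheet, [po_template]))
```

Because each extracted value is stored under its document type and data context, a search index built from such records could let a query state that "Acme Ltd" is meant as a customer name in a purchase order rather than as a bare keyword.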
