Text Mining Workflows for Indexing Archives with Automatically Extracted Semantic Metadata

Abstract

Many digital libraries hold vast amounts of textual data, which makes finding the information relevant to a given user a challenge. The unstructured and ambiguous nature of the natural language in which documents are written poses a barrier to the accessibility and discovery of information. This can be alleviated by indexing documents with semantic metadata, e.g., by tagging them with terms that indicate their “aboutness”. As manually indexing these documents is impracticable, automatic tools capable of generating semantic metadata and building search indexes have become attractive solutions. In this tutorial, we demonstrate how digital library developers and managers can use the Argo text mining platform to develop their own customised, modular workflows for automatic semantic metadata generation and search index construction. Because Argo provides a graphical interface for workflow construction and execution, digital library practitioners gain the technical know-how needed to build semantic search indexes without any programming effort. We believe that this in turn will allow various digital libraries to build search systems that enable their users to find and discover information of interest more efficiently and accurately.