Conference proceedings: Complex, Intelligent and Software Intensive Systems, 2009. CISIS '09

Improving Data Discovery for Metadata Repositories through Semantic Search

Abstract

The amount of ecological data available electronically is increasing at a rapid rate, e.g., over 15,000 data sets are available today in the Knowledge Network for Biocomplexity (KNB) alone. Using the existing search capabilities of these online data repositories, however, scientists struggle to quickly locate data that are relevant to their needs or that will integrate with their current data sets. Semantic technologies aim at addressing many of these problems and hold the promise of enabling more powerful "smart" searches of online data archives. We describe new semantic search features within the Metacat metadata system, which is used by many ecological research sites around the world for archiving their data using a standardized metadata format. Our semantic search system adds to Metacat the ability to store OWL-DL ontologies in addition to semantic annotations that link data set attributes to ontology terms. Our approach also extends Metacat to improve metadata search in multiple ways: (i) by expanding standard keyword searches with ontology term hierarchies; (ii) by allowing keyword searches to be applied to annotations in addition to traditional metadata; and (iii) by allowing more structured searches over annotations via ontology terms. We describe our implementation of these extensions, and compare and contrast these different types of search for a corpus of annotated documents. As data repositories continue to grow, these tools will be instrumental in helping scientists precisely locate and then interpret data for their research needs.
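
The three search modes listed in the abstract can be illustrated with a small Python sketch. This is not Metacat's API or the authors' implementation: the toy ontology, the data set records, and every function name below are hypothetical, and the sketch assumes that a structured search also matches subclasses of the requested term.

```python
from collections import defaultdict

# Hypothetical toy ontology: child term -> parent term (subclass-of).
SUBCLASS_OF = {
    "tree": "plant",
    "grass": "plant",
    "oak": "tree",
    "plant": "organism",
}

# Hypothetical data sets: free-text metadata plus annotations that link
# each data set attribute to one ontology term.
DATASETS = {
    "ds1": {"metadata": "oak seedling heights, 2005", "annotations": {"ht": "oak"}},
    "ds2": {"metadata": "grassland biomass plots", "annotations": {"bm": "grass"}},
    "ds3": {"metadata": "canopy survey", "annotations": {"sp": "tree"}},
    "ds4": {"metadata": "tree ring widths", "annotations": {}},
}


def descendants(term):
    """Return the term together with every term below it in the hierarchy."""
    children = defaultdict(set)
    for child, parent in SUBCLASS_OF.items():
        children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def keyword_search(keyword):
    """Baseline: plain substring match against the metadata text only."""
    kw = keyword.lower()
    return {ds for ds, rec in DATASETS.items() if kw in rec["metadata"].lower()}


def expanded_keyword_search(keyword):
    """(i) Expand the keyword with its subclass terms, then search metadata."""
    hits = set()
    for term in descendants(keyword.lower()):
        hits |= keyword_search(term)
    return hits


def annotation_keyword_search(keyword):
    """(ii) Match the keyword against annotation terms as well as metadata."""
    kw = keyword.lower()
    hits = keyword_search(kw)
    for ds, rec in DATASETS.items():
        if any(kw in term for term in rec["annotations"].values()):
            hits.add(ds)
    return hits


def structured_search(term):
    """(iii) Match annotations by ontology term, ignoring free text.

    Assumption: subclasses of the requested term also count as matches.
    """
    wanted = descendants(term.lower())
    return {
        ds for ds, rec in DATASETS.items()
        if wanted & set(rec["annotations"].values())
    }
```

On this toy corpus, a plain keyword search for "tree" finds only the data set whose metadata text contains the word, while the expanded search also finds the oak data set, and the annotation-aware search finds the data set annotated with the tree term even though its metadata never mentions it.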
