Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence

A parallel computing approach to creating engineering concept spaces for semantic retrieval: the Illinois Digital Library Initiative project


Abstract

This research presents preliminary results from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of automatic thesaurus generation techniques, which we refer to as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across them could help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. To address the scalability issue that large-scale information retrieval and analysis poses for the current Illinois DLI project, we conducted experiments using the concept space approach on parallel supercomputers. Our test collection included computer science and electrical engineering abstracts extracted from the INSPEC database. The concept space approach called for extensive textual and statistical analysis (a form of knowledge discovery) based on automatic indexing and co-occurrence analysis algorithms, both previously tested in the biology domain. Initial testing results using a 512-node CM-5 and a 16-processor SGI Power Challenge were promising.
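As an illustration of what a graph of terms with weighted co-occurrence relationships might look like, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the weighting formula, so the sketch assumes a simple asymmetric weight (the fraction of documents containing term A that also contain term B). It also assumes documents have already been indexed into term lists. The function names `build_concept_space` and `related_terms` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_concept_space(docs):
    """Build a directed, weighted co-occurrence graph from indexed documents.

    docs: list of term lists, one list per document (assumed already indexed).
    Returns a dict mapping (term_a, term_b) to an assumed weight:
    (documents containing both terms) / (documents containing term_a).
    """
    doc_freq = Counter()   # number of documents containing each term
    pair_freq = Counter()  # number of documents containing both terms
    for terms in docs:
        unique = sorted(set(terms))  # count each term once per document
        doc_freq.update(unique)
        for a, b in combinations(unique, 2):
            pair_freq[(a, b)] += 1

    weights = {}
    for (a, b), n in pair_freq.items():
        # The weight is asymmetric: it depends on the source term's frequency.
        weights[(a, b)] = n / doc_freq[a]
        weights[(b, a)] = n / doc_freq[b]
    return weights


def related_terms(weights, term, top_n=3):
    """Return up to top_n neighbors of term, strongest weight first."""
    neighbors = [(b, w) for (a, b), w in weights.items() if a == term]
    return sorted(neighbors, key=lambda item: (-item[1], item[0]))[:top_n]


if __name__ == "__main__":
    sample_docs = [
        ["parallel", "retrieval"],
        ["parallel", "thesaurus"],
        ["retrieval", "thesaurus", "parallel"],
    ]
    space = build_concept_space(sample_docs)
    print(related_terms(space, "retrieval"))
```

Because the per-document counts are plain `Counter` objects, partial counts computed over separate document partitions can be summed afterwards, which is one reason this kind of analysis lends itself to parallel execution. How the project actually distributed the work across the CM-5 and SGI Power Challenge is not described in the abstract.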
