Venue: ACM International Conference on Information and Knowledge Management

Extracting Cross References from Life Science Databases for Search Result Ranking

Abstract

Scholars in life sciences have to process huge amounts of data in a disciplined and efficient way. These data are spread among thousands of databases which overlap in content but differ substantially with respect to interface, formats and data structure. Search engines have the potential to assist in data retrieval from these structured sources but fall short of providing a relevance ranking of the results that reflects the needs of life science scholars. One such need is to gain insight into cross-references among entities in the databases, whereby search hits with many cross-references are expected to be more informative than those with few cross-references. In this work, we investigate to what extent this expectation holds. We propose BioXRef, a method that extracts cross-references from multiple life science databases by combining targeted crawling, pointer chasing, sampling and information extraction. We study the retrieval quality of our method and the relationship between manually crafted relevance ranking and relevance ranking based on cross-references, and report first, promising results.
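
The comparison described in the abstract, a ranking by cross-reference count set against a manually crafted ranking, can be illustrated with a minimal sketch. This is not BioXRef itself: the hit identifiers and counts below are made up, the function names are hypothetical, and the abstract does not say which agreement measure the authors use. Spearman's rank correlation is swapped in here as one common choice, in its simple form for rankings without ties.

```python
from typing import Dict, List


def rank_by_cross_references(xref_counts: Dict[str, int]) -> List[str]:
    """Order hit ids by descending cross-reference count; ties are broken by id."""
    return sorted(xref_counts, key=lambda hit: (-xref_counts[hit], hit))


def spearman_rho(ranking_a: List[str], ranking_b: List[str]) -> float:
    """Spearman rank correlation between two tie-free rankings of the same items."""
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to correlate")
    if len(set(ranking_a)) != n or len(ranking_b) != n or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must be permutations of the same items")
    position_in_b = {hit: i for i, hit in enumerate(ranking_b)}
    # Sum of squared differences between each item's positions in the two rankings.
    d_squared = sum((i - position_in_b[hit]) ** 2 for i, hit in enumerate(ranking_a))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical data: cross-reference counts per search hit (assumed, not from the paper).
xref_counts = {"hit_a": 12, "hit_b": 3, "hit_c": 7, "hit_d": 0}
# Hypothetical manually crafted relevance ranking, most relevant first.
manual_ranking = ["hit_a", "hit_b", "hit_c", "hit_d"]

xref_ranking = rank_by_cross_references(xref_counts)
rho = spearman_rho(xref_ranking, manual_ranking)
print(xref_ranking, rho)
```

With these made-up inputs the two rankings differ only in the order of `hit_b` and `hit_c`, so the sum of squared position differences is 2 and rho works out to 1 - 6·2/(4·15) = 0.8. A value near 1 would support the expectation that hits with many cross-references are the more informative ones; a value near 0 would not.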