The LAILAPS Search Engine: A Feature Model for Relevance Ranking in Life Science Databases


Abstract

Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The rapid growth of life science databases into a vast network of interconnected information systems is in equal measure a major challenge and a great opportunity for life science research. The knowledge found on the Web, in particular in life science databases, is a valuable resource. To bring it to the scientist's desktop, well-performing search engines are essential. In this context, neither the response time nor the number of results is decisive; when a query returns millions of results, the most crucial factor is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observing how users inspect search engine results, we condensed a set of nine relevance-discriminating features. These features are used intuitively by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. Deriving a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that were trained with a reference set of relevant database entries for 19 protein queries. These concepts are implemented in the LAILAPS search engine, which supports a flexible text index and a simple data import format. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
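The regression setup described in the abstract can be illustrated with a minimal sketch. The abstract does not name the nine features, the network architecture, or the training procedure, so everything below is an assumption made for illustration: the function names (`make_network`, `predict`, `train`, `rank`), the single tanh hidden layer, the plain gradient-descent training, and the toy feature vectors are hypothetical and are not taken from LAILAPS.

```python
import math
import random

NUM_FEATURES = 9  # the abstract states nine features; their definitions are not given


def make_network(hidden_units=4, seed=0):
    """Build a one-hidden-layer network with small random weights (assumed architecture)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(NUM_FEATURES)]
               for _ in range(hidden_units)],
        "b1": [0.0] * hidden_units,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden_units)],
        "b2": 0.0,
    }


def predict(net, features):
    """Forward pass: tanh hidden layer, sigmoid output. Returns (score, hidden activations)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    z = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return 1.0 / (1.0 + math.exp(-z)), hidden


def train(net, samples, epochs=2000, lr=0.1):
    """Fit the network to (features, target relevance) pairs by gradient descent on squared error."""
    for _ in range(epochs):
        for features, target in samples:
            out, hidden = predict(net, features)
            # Gradient of 0.5 * (out - target)^2 with respect to the output pre-activation.
            delta_out = (out - target) * out * (1.0 - out)
            for j, h in enumerate(hidden):
                # Hidden-layer gradient, computed before w2[j] is updated.
                delta_h = delta_out * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * delta_out * h
                net["b1"][j] -= lr * delta_h
                for i, x in enumerate(features):
                    net["w1"][j][i] -= lr * delta_h * x
            net["b2"] -= lr * delta_out
    return net


def rank(net, entries):
    """Sort (entry_id, features) pairs by predicted relevance, highest first."""
    return sorted(entries, key=lambda e: predict(net, e[1])[0], reverse=True)


# Toy usage with made-up feature vectors and relevance labels (not real data).
toy_training_set = [([1.0] * NUM_FEATURES, 1.0), ([0.0] * NUM_FEATURES, 0.0)]
model = train(make_network(), toy_training_set)
ordered = rank(model, [("entry_a", [0.0] * NUM_FEATURES),
                       ("entry_b", [1.0] * NUM_FEATURES)])
print([entry_id for entry_id, _ in ordered])
```

In this sketch the trained network acts as the relevance prediction function, and ranking reduces to sorting entries by its output. A real system would replace the toy vectors with features measured on database entries and labels from a curated reference set.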