Source: Journal of Integrative Bioinformatics

The LAILAPS Search Engine: A Feature Model for Relevance Ranking in Life Science Databases

Abstract

Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The rapid growth of life science databases into a vast network of interconnected information systems is at once a major challenge and a great opportunity for life science research. The knowledge found on the Web, in particular in life science databases, is a valuable resource. To bring it to the scientist's desktop, well-performing search engines are essential. Neither the response time nor the number of results is the decisive factor here. For queries that return millions of results, the most crucial factor is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observations of user behavior during the inspection of search engine results, we condensed a set of 9 relevance-discriminating features. These features are used intuitively by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. Deriving a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks trained with a reference set of relevant database entries for 19 protein queries. These concepts are implemented in the LAILAPS search engine, which supports a flexible text index and a simple data import format. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de
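
The regression setup described in the abstract can be illustrated with a small sketch. It is not the LAILAPS implementation: the abstract does not name the nine features, the network architecture, or the training procedure, so everything below is an assumption made for illustration. The class `TinyRegressor`, the helper `synthetic_reference_set`, the function `rank_entries`, and the synthetic data (a hidden relevance value that drives all nine feature values) are hypothetical. The sketch trains a one-hidden-layer network to map a nine-element feature vector to a relevance score and then sorts entries by that score.

```python
import math
import random

N_FEATURES = 9  # the abstract counts nine features; it does not list them


class TinyRegressor:
    """Hypothetical one-hidden-layer network trained by stochastic gradient descent."""

    def __init__(self, n_in, n_hidden=6, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def mse(self, xs, ys):
        return sum((self.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    def fit(self, xs, ys, epochs=100, lr=0.02):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Gradient w.r.t. the hidden pre-activation (tanh derivative),
                    # computed before w2[j] is updated.
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err


def synthetic_reference_set(n, rng):
    """Placeholder data: a latent relevance q drives all nine feature values."""
    xs, ys = [], []
    for _ in range(n):
        q = rng.random()
        xs.append([q + rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES)])
        ys.append(q)
    return xs, ys


def rank_entries(model, entries):
    """Order (entry_id, feature_vector) pairs by predicted relevance, best first."""
    scored = [(model.predict(features), entry_id) for entry_id, features in entries]
    return [entry_id for _, entry_id in sorted(scored, reverse=True)]


def demo():
    rng = random.Random(42)
    xs, ys = synthetic_reference_set(100, rng)
    model = TinyRegressor(N_FEATURES)
    before = model.mse(xs, ys)
    model.fit(xs, ys)
    after = model.mse(xs, ys)
    entries = [("low", [0.1] * N_FEATURES),
               ("high", [0.9] * N_FEATURES),
               ("mid", [0.5] * N_FEATURES)]
    return before, after, rank_entries(model, entries)


if __name__ == "__main__":
    before, after, ranking = demo()
    print(f"MSE before training: {before:.4f}, after: {after:.4f}")
    print("Ranking:", ranking)
```

In a real setting, the feature vectors would be computed from database entries for a given query, and the training targets would come from a curated reference set such as the one for the 19 protein queries mentioned above.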
