IEEE/ACM Transactions on Computational Biology and Bioinformatics

Identifying Relevant Data for a Biological Database: Handcrafted Rules versus Machine Learning


Abstract

With well over 1,000 specialized biological databases in use today, the task of automatically identifying novel, relevant data for such databases is increasingly important. In this paper, we describe practical machine learning approaches for identifying MEDLINE documents and Swiss-Prot/TrEMBL protein records, for incorporation into a specialized biological database of transport proteins named TCDB. We show that both learning approaches outperform rules created by hand by a human expert. As one of the first case studies involving two different approaches to updating a deployed database, both the methods compared and the results will be of interest to curators of many specialized databases.
