Journal: Expert Systems with Applications

KSPF: using gene sequence patterns and data mining for biological knowledge management

Abstract

Most traditional approaches to annotating protein families are inefficient: they must cope with high-throughput sequence data, complex analytic tools and unorganized literature, and their results cannot be reused. Here, we describe a framework, knowledge sharing for protein families (KSPF), that uses sequence pattern data mining and knowledge management to improve upon traditional approaches. It is divided into three modules: automation, retrieval and refinement. The framework provides an environment in which biological researchers can submit an unknown protein sequence and search for information on its sub-family. Once the sub-family has been identified, the related literature and the knowledge records contributed by previous users can be retrieved, and the possible functions of the protein can then be predicted from them. The proposed framework is applicable to all types of protein families. We illustrate it with a search for a plant lipid transfer protein (PLTP). The resulting system, KS-PLTP, maps an unknown sequence to a sub-family in the PLTP knowledge base and predicts the sequence's possible function. The prediction rate of KS-PLTP reached 89.6%.
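
The submit, classify and retrieve workflow described in the abstract can be pictured with a small sketch. This is not the KSPF implementation: the abstract does not say how patterns are mined or matched, so the sketch assumes each sub-family is represented by one regular expression, and every name in it (SubFamily, KNOWLEDGE_BASE, classify, retrieve, add_record), every pattern and every record is a hypothetical placeholder.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SubFamily:
    """Hypothetical knowledge-base entry for one protein sub-family."""
    name: str
    pattern: str  # assumed: a regex standing in for a mined sequence pattern
    records: list = field(default_factory=list)  # literature / user notes


# Toy knowledge base; the patterns and records are invented placeholders.
KNOWLEDGE_BASE = [
    SubFamily("subfamily_A", r"C.{2}C.{3}GG", ["record A1 (placeholder)"]),
    SubFamily("subfamily_B", r"CC.{4}K[ST]", ["record B1 (placeholder)"]),
]


def classify(sequence, kb=KNOWLEDGE_BASE):
    """Return the first sub-family whose pattern occurs in the sequence, else None."""
    seq = sequence.upper()
    for family in kb:
        if re.search(family.pattern, seq):
            return family
    return None


def retrieve(sequence, kb=KNOWLEDGE_BASE):
    """Map a sequence to a sub-family and return (name, copy of its records)."""
    family = classify(sequence, kb)
    if family is None:
        return None, []
    return family.name, list(family.records)


def add_record(sequence, note, kb=KNOWLEDGE_BASE):
    """Attach a user note to the matched sub-family; return True if one matched."""
    family = classify(sequence, kb)
    if family is None:
        return False
    family.records.append(note)
    return True
```

In this sketch, retrieve plays the role of the retrieval module and add_record loosely mirrors the idea of users contributing reusable knowledge records; a real system would replace the regex lookup with whatever pattern-mining and matching method the paper actually uses.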