Journal: BMC Bioinformatics

Prediction of bioluminescent proteins by using sequence-derived features and lineage-specific scheme

Abstract

Background: Bioluminescent proteins (BLPs) occur in many living organisms. Because BLPs emit light, they can serve as easily detected biomarkers in biomedical research, for example in gene expression analysis and studies of signal transduction pathways. Accurate identification of BLPs is therefore important for disease diagnosis and biomedical engineering. In this paper, we propose a novel, accurate sequence-based method named PredBLP (Prediction of BioLuminescent Proteins) to predict BLPs.

Results: We collect a series of sequence-derived features that have been shown to be involved in the structure and function of BLPs. These features include amino acid composition, dipeptide composition, sequence motifs and physicochemical properties. We further show that the combination of all four feature types outperforms any other combination and any individual feature type. To remove potentially irrelevant or redundant features, we use the Fisher Markov Selector together with a Sequential Backward Selection strategy to select the optimal feature subsets. Additionally, we design a lineage-specific scheme, which is shown to be more effective than traditional universal approaches.

Conclusion: Experiments on benchmark datasets demonstrate the robustness of PredBLP. We demonstrate that lineage-specific models significantly outperform universal ones. We also test the generalization capability of PredBLP on independent testing datasets as well as on BLPs newly deposited in UniProt. PredBLP is shown to outperform many state-of-the-art methods. A web server named PredBLP, which implements the proposed method, is freely available for academic use.
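
Two of the feature types named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal sketch. The abstract does not give the exact definitions used by PredBLP, so the code below assumes the commonly used ones: the fraction of each of the 20 standard residues, and the fraction of each of the 400 ordered residue pairs. The function names and the choice to skip non-standard residues are assumptions made for illustration, not details taken from the paper.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return a 20-dimensional vector of residue fractions.

    Assumption: residues outside the 20 standard letters are ignored.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of adjacent-pair fractions.

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real bioluminescent protein.
    example = "MKAAC"
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))  # 20 + 400 features per sequence
```

Concatenating the two vectors gives a fixed-length representation per sequence regardless of sequence length, which is what allows a feature selection step and a classifier to be applied afterwards.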