Journal: Soft computing: A fusion of foundations, methodologies and applications

Sequence- and structure-based prediction of amyloidogenic regions in proteins

Abstract

Machine learning methods are increasingly used in proteomics research, especially in analyzing and predicting protein structures, functions, subcellular localizations and interactions. In recent years, much research has focused on the protein misfolding problem and on the impact of unfolded and defective proteins on cell dysfunction, owing to its considerable importance for molecular medicine. The degradation and deposition of these abnormal proteins often result in the formation of plaque cores, among them the so-called amyloid fibrils, which are responsible for an increasing number of highly debilitating disorders in humans. Yet a significant challenge remains, especially in understanding the underlying causes and major risk factors of these harmful deposits in vital organs and tissues. This paper explores the potential of string kernel-based support vector machines for predicting amyloidogenic regions in proteins by incorporating the most informative features of the protein sequence, such as predicted secondary structure and solvent accessibility, with a special focus on alpha-helical conformations, which seem to be primarily involved in amyloidogenesis. Comparisons with the most popular methods on the Pep424 and Reg33 benchmark datasets indicate the robustness of the predictive model. Furthermore, the results show accurate prediction of regions promoting fibrillogenesis in experimentally determined amyloid proteins, reveal that the five amino acids Leucine, Glycine, Alanine, Valine and Serine are predominantly present in amyloid-prone regions, and confirm that the core regions of an amyloid aggregate are not necessarily fully buried.
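The abstract does not say which string kernel the authors use or how the structural features enter it. As a rough illustration only, the sketch below assumes a k-spectrum kernel (shared k-mer counts) and assumes that the sequence and a predicted secondary-structure string are combined by summing two normalized kernels. All function names and the toy strings are hypothetical and are not taken from the paper or from the Pep424 and Reg33 datasets.

```python
from collections import Counter
from math import sqrt


def spectrum_counts(s, k):
    """Count every contiguous k-mer in the string s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(a, b, k=3):
    """k-spectrum kernel: inner product of the k-mer count vectors of a and b."""
    ca, cb = spectrum_counts(a, k), spectrum_counts(b, k)
    # A Counter returns 0 for k-mers it has not seen.
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_kernel(a, b, k=3):
    """Cosine-normalized spectrum kernel, so that K(a, a) == 1 for non-trivial a."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


def combined_kernel(x, y, k=3):
    """Assumed combination: x and y are (sequence, secondary_structure) pairs,
    and the two normalized kernels are simply added."""
    return normalized_kernel(x[0], y[0], k) + normalized_kernel(x[1], y[1], k)


def gram_matrix(samples, k=3):
    """Pairwise kernel values for a list of (sequence, secondary_structure) pairs."""
    return [[combined_kernel(x, y, k) for y in samples] for x in samples]


# Toy inputs: amino-acid strings paired with made-up H/E/C structure strings.
toy = [("LVAGSLVA", "HHHHHHHH"), ("LVAGSKKD", "HHHHCCCC"), ("KDEKDEKD", "CCCCCCCC")]
K = gram_matrix(toy, k=2)
```

A Gram matrix of this kind could be handed to a support vector machine that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Solvent accessibility could be added as a third string in the same way, again as an assumption rather than the paper's stated design.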
