
3D visualization and cluster analysis of unstructured protein sequences using ARCSA with a file conversion approach

Abstract

This work explains the synthesis of protein structures based on clustering, an unsupervised learning method. Protein structure prediction was performed for different crab and egg datasets with inputs collected from the Protein Data Bank (PDB IDs: 3LIG, 2W3Z, 3ZVQ, 2KLR and 2YIZ). The three-dimensional protein structures were merged using MergeSets, an instance filter built into the data mining tool. The problem addressed by the proposed methodology, referred to as attribute-related cluster sequence analysis (ARCSA), is to identify a well-performing algorithm for clustering protein structures by comparing four existing algorithms: k-means, expectation maximization, farthest first and COBWEB. Experiments were conducted with the BioWeka data mining tool, Modeler 9.15 and PyMOL, with scripts written in Python. The paper shows that the expectation maximization algorithm performs best for structured protein clustering, and the authors expect this to pave the way for identifying better algorithms for supervised learning methods.
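
To make one of the four compared algorithms concrete, the following is a minimal sketch of farthest-first clustering in plain Python. It is not the authors' implementation (the abstract states that the experiments used BioWeka); the function name `farthest_first`, the choice of the first point as the initial center, and the sample coordinates are all assumptions made for illustration only.

```python
import math


def farthest_first(points, k, first=0):
    """Cluster points by farthest-first traversal.

    Starts from the point at index `first` (an assumption; other
    implementations may pick the first center at random), then repeatedly
    adds the point that is farthest from its nearest chosen center.
    Returns (center_indices, labels).
    """
    centers = [first]
    # nearest[i] = distance from point i to its closest chosen center so far
    nearest = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=lambda i: nearest[i])
        centers.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    # Assign every point to the index (0..k-1) of its nearest center
    labels = [
        min(range(k), key=lambda c: math.dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels


# Hypothetical 3D coordinates, not taken from any PDB entry
coords = [
    (0.0, 0.0, 0.0), (0.5, 0.1, 0.0), (0.2, 0.4, 0.1),
    (10.0, 10.0, 10.0), (10.3, 9.8, 10.1),
    (0.0, 10.0, 0.0), (0.2, 10.1, 0.3),
]
centers, labels = farthest_first(coords, k=3)
print("center indices:", centers)
print("cluster labels:", labels)
```

Because every step is deterministic once the first center is fixed, the same input always yields the same clusters, which makes this variant convenient as a baseline when comparing against iterative methods such as k-means or expectation maximization.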

Bibliographic details

  • Source
    Journal of supercomputing | 2020, Issue 6 | pp. 4287-4301 | 15 pages
  • Authors

    Vignesh U.; Parvathi R.;

  • Author affiliations

    VIT Univ Chennai Campus Sch Comp Sci & Engn Vandalur Kelambakkam Rd Chennai 600127 Tamil Nadu India;

    VIT Univ Chennai Campus Sch Comp Sci & Engn Vandalur Kelambakkam Rd Chennai 600127 Tamil Nadu India;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA;
  • Original format: PDF
  • Text language: English
  • Keywords

    Protein clustering; Biological data mining; Drug discovery;
