Published in: Computational and Structural Biotechnology Journal (indexed under U.S. National Institutes of Health literature)

Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology



Abstract

Molecular biological data have increased rapidly with recent progress in the omics fields (e.g., genomics, transcriptomics, proteomics and metabolomics), which necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of the KNApSAcK Family DB in metabolomics and related areas, discusses several statistical methods for handling multivariate data, and shows their application to Indonesian blended herbal medicines (Jamu) as a case study. Exploration using a biplot reveals that many plants are rarely utilized, while some plants are highly utilized toward specific efficacies. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine serve as the predictors, whereas the efficacy of each Jamu provides the response. This model achieves 71.6% correct classification in predicting efficacy. A permutation test is then used to determine the plants that serve as main ingredients in Jamu formulas by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of the plants that serve as main ingredients in Jamu medicines, information on the pharmacological activity of the plants is added to the predictor block. An N-PLS-DA model, a multiway version of PLS-DA, is then utilized to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific to certain efficacies, whereas other activities act across many efficacies. The mathematical modeling introduced in the present study can be utilized in global analysis of big data aimed at revealing the underlying biology.
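The modeling idea in the abstract (plants as binary predictors, efficacy as the response, and a permutation test on the model coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified one-component PLS1 model fitted one-vs-rest for a single efficacy class, and the formulas, the plant names (`plant_A` to `plant_D`), the efficacy labels (`E1`, `E2`) and the function names are all hypothetical toy values invented for illustration.

```python
import random

def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pls1_coefficients(X, y):
    """Regression coefficients of a one-component PLS1 model (X, y mean-centered)."""
    n, p = len(X), len(X[0])
    cols = [center([X[i][j] for i in range(n)]) for j in range(p)]
    yc = center(y)
    # Weight vector w is proportional to X^T y.
    w = [sum(cols[j][i] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return [0.0] * p
    w = [v / norm for v in w]
    # Scores t = X w, and the inner regression coefficient q of y on t.
    t = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # With a single component the coefficient vector reduces to w * q.
    return [wj * q for wj in w]

def permutation_pvalues(X, y, n_perm=500, seed=0):
    """Shuffle the response and count how often |coefficient| is matched or exceeded."""
    rng = random.Random(seed)
    observed = pls1_coefficients(X, y)
    exceed = [0] * len(observed)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        permuted = pls1_coefficients(X, y_perm)
        for j, (b_obs, b_perm) in enumerate(zip(observed, permuted)):
            if abs(b_perm) >= abs(b_obs):
                exceed[j] += 1
    return observed, [(c + 1) / (n_perm + 1) for c in exceed]

# Hypothetical toy data: each formula is (set of plants used, efficacy label).
formulas = [
    ({"plant_A", "plant_B"}, "E1"),
    ({"plant_A", "plant_C"}, "E1"),
    ({"plant_A"}, "E1"),
    ({"plant_A", "plant_B", "plant_C"}, "E1"),
    ({"plant_B", "plant_D"}, "E2"),
    ({"plant_C", "plant_D"}, "E2"),
    ({"plant_D"}, "E2"),
    ({"plant_B", "plant_C", "plant_D"}, "E2"),
]
plants = sorted(set().union(*(used for used, _ in formulas)))
# Predictor block: 1.0 if the plant is used in the formula, else 0.0.
X = [[1.0 if name in used else 0.0 for name in plants] for used, _ in formulas]
# Response: one-vs-rest indicator for efficacy E1.
y = [1.0 if efficacy == "E1" else 0.0 for _, efficacy in formulas]

coefs, pvals = permutation_pvalues(X, y)
for name, b, p in zip(plants, coefs, pvals):
    print(f"{name}: coefficient={b:+.3f}, permutation p={p:.3f}")
```

In this toy example, plants that appear equally often under both efficacies receive a zero coefficient, while plants used only for one efficacy receive the largest coefficients in absolute value. A full PLS-DA as described in the abstract would handle all efficacy classes jointly and would typically use more than one component.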
