Robust Supervised and Unsupervised Statistical Learning for HIV Type 1 Coreceptor Usage Analysis

Abstract

Human immunodeficiency virus type 1 (HIV-1) isolates differ in their use of coreceptors to enter target cells. This has important implications for both viral pathogenicity and susceptibility to entry inhibitors, whether recently approved or under development. Predicting HIV-1 coreceptor usage from sequence information is a challenging task because of the high variability of the envelope. The associations of whole HIV-1 envelope genetic features (subtype, mutations, insertions–deletions, physicochemical properties) and clinical markers (viral RNA load, CD8+ and CD4+ T cell counts) with viral tropism were investigated using a set of 2896 sequence-tropism pairs (659 after filtering, 593 patients) available at the Los Alamos HIV database. Bootstrapped hierarchical clustering was used to assess mutational covariation. Univariate and multivariate analyses were performed to assess the relative importance of different features. Different machine learning methods (logistic regression, support vector machines, decision trees, rule bases, instance-based reasoning) and feature selection methods (filter and embedded), along with loss functions (accuracy, AUC of ROC curves, sensitivity, specificity, F-measure), were applied and compared for the classification of X4 variants. Extra-sample error was estimated via multiple cross-validation, with adjustments for multiple testing. A high-performing, compact, and interpretable logistic regression model was derived to infer HIV-1 coreceptor tropism for a given patient [accuracy=92.76 (SD 3.07); AUC=0.93 (SD 0.04)].
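
The evaluation measures named in the abstract (accuracy, sensitivity, specificity, F-measure, and AUC of the ROC curve, with X4 as the positive class) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration, and the sketch assumes both classes are present and at least one sample is predicted X4.

```python
def confusion_counts(y_true, y_pred, positive="X4"):
    """Count TP, TN, FP, FN, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def summary_metrics(y_true, y_pred, positive="X4"):
    """Accuracy, sensitivity, specificity, and F-measure from hard predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn)   # fraction of true X4 called X4
    specificity = tn / (tn + fp)   # fraction of non-X4 called non-X4
    precision = tp / (tp + fp)     # fraction of X4 calls that are correct
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc(y_true, scores, positive="X4"):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equivalent to the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true tropism labels and model scores for P(X4).
y_true = ["X4", "X4", "X4", "R5", "R5", "R5", "R5", "R5"]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
# Assumed decision threshold of 0.5 to turn scores into class calls.
y_pred = ["X4" if s >= 0.5 else "R5" for s in scores]

print(summary_metrics(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

In a cross-validated setting such as the one the abstract describes, measures like these would be computed on each held-out fold and then summarized as a mean with a standard deviation.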

Bibliographic record

  • Source
    AIDS Research and Human Retroviruses | 2009, Issue 3 | pp. 305-314 | 10 pages
  • Author affiliations

    Department of Virology, National Institute for Infectious Diseases “L. Spallanzani,” 00149 Rome, Italy.

    Infectious Diseases Clinic, Catholic University of Sacro Cuore, 00148 Rome, Italy.

    Department of Computer Science and Automation (DIA), University of Roma TRE, 00146 Rome, Italy.

    Department of Computer Science and Automation (DIA), University of Roma TRE, 00146 Rome, Italy.

    Infectious Diseases Clinic, Catholic University of Sacro Cuore, 00148 Rome, Italy.

    Section of Microbiology, Department of Molecular Biology, University of Siena, Policlinico “Le Scotte,” Viale Bracci, Siena, Italy.

  • Original format: PDF
  • Language: English
