Journal: AIDS Research and Human Retroviruses

Robust supervised and unsupervised statistical learning for HIV type 1 coreceptor usage analysis.

Abstract

Human immunodeficiency virus type 1 (HIV-1) isolates differ in their use of coreceptors to enter target cells. This has important implications for both viral pathogenicity and susceptibility to entry inhibitors, whether recently approved or under development. Predicting HIV-1 coreceptor usage from sequence information is challenging because of the high variability of the envelope. The associations of whole-envelope HIV-1 genetic features (subtype, mutations, insertions-deletions, physicochemical properties) and clinical markers (viral RNA load, CD8(+) and CD4(+) T cell counts) with viral tropism were investigated using a set of 2896 sequence-tropism pairs (659 after filtering, from 593 patients) available at the Los Alamos HIV database. Bootstrapped hierarchical clustering was used to assess mutational covariation. Univariate and multivariate analyses were performed to assess the relative importance of different features. Different machine learning methods (logistic regression, support vector machines, decision trees, rule bases, instance-based reasoning) and feature selection methods (filter and embedded), along with loss functions (accuracy, AUC of ROC curves, sensitivity, specificity, F-measure), were applied and compared for the classification of X4 variants. Extra-sample error was estimated via multiple cross-validation, with adjustments for multiple testing. A high-performing, compact, and interpretable logistic regression model was derived to infer HIV-1 coreceptor tropism for a given patient [accuracy = 92.76 (SD 3.07); AUC = 0.93 (SD 0.04)].
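
The evaluation measures listed in the abstract (accuracy, sensitivity, specificity, F-measure, and AUC of the ROC curve) can be illustrated with a short sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only, with label 1 standing for an X4 variant and 0 for a non-X4 variant.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int],
                           scores: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Threshold the scores and compute confusion-matrix based measures.

    Label 1 is the positive class (assumed here to mean X4).
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half). Quadratic in sample size, which is
    fine for a small illustration."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 3 X4 samples and 5 non-X4 samples.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
    print(classification_metrics(labels, predicted))
    print("AUC:", roc_auc(labels, predicted))
```

In a study like the one described, such measures would be computed on held-out folds of a cross-validation and then summarized as a mean and standard deviation, which is the form in which the abstract reports accuracy and AUC.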