IEEE International Conference on Bioinformatics and Biomedicine

Comparative study of an HIV risk scorecard and regression models to rank effects of demographic characteristics on risk of acquiring an HIV infection




This paper describes the development of an HIV risk scorecard using SAS Enterprise Miner™. The scorecard was developed from the 2007 South African annual antenatal HIV and syphilis seroprevalence data, and limited comparisons are made with a more recent 2010 antenatal database. The antenatal data contain demographic characteristics for each pregnant woman, including her age, her male sexual partner's age, population group, level of education, gravidity, parity, and HIV and syphilis status. The purpose of this research was to use a scorecard to rank the effects of these demographic characteristics on a pregnant woman's risk of acquiring an HIV infection. The project encompassed selection of the data sample, classing, selection of demographic characteristics, fitting of a regression model, generation of weights of evidence (WOE), calculation of information values (IVs), and creation and validation of an HIV risk scorecard. The educational level and syphilis status of the pregnant women produced information values below 0.05 and were excluded from the final HIV risk scorecard. Based on their respective information values, the following four demographic characteristics were found to be of medium predictive strength and were therefore included in the final HIV risk scorecard: the pregnant woman's age, the age of her male sexual partner, gravidity, and parity. The age of the pregnant woman had the highest information value and Gini coefficient. The final objective of this research was to demonstrate that a binned-variable HIV risk scorecard can provide as much risk ranking as any other regression-based model.
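The abstract does not give the formulas used for the weights of evidence and information values. As a minimal sketch, assuming the common credit-scoring convention (WOE as the log ratio of the non-event share to the event share in each bin, and IV as the sum of the share differences weighted by WOE), the calculation for one binned characteristic might look like the following. The function name `woe_iv` and the bin counts are hypothetical and are not taken from the antenatal data; only the 0.05 cut-off comes from the abstract.

```python
import math

def woe_iv(bins):
    """Return per-bin WOE values and the total IV for one characteristic.

    bins: list of (label, n_event, n_non_event) tuples, where an "event"
    is an HIV-positive result. Every count must be greater than zero,
    otherwise the logarithm below is undefined.
    """
    total_event = sum(event for _, event, _ in bins)
    total_non_event = sum(non_event for _, _, non_event in bins)

    rows = []
    iv = 0.0
    for label, event, non_event in bins:
        share_event = event / total_event              # share of all events in this bin
        share_non_event = non_event / total_non_event  # share of all non-events in this bin
        woe = math.log(share_non_event / share_event)  # assumed sign convention
        iv += (share_non_event - share_event) * woe
        rows.append((label, woe))
    return rows, iv

# Hypothetical age bins: (label, HIV-positive count, HIV-negative count).
age_bins = [
    ("<20", 100, 400),
    ("20-29", 300, 400),
    ("30+", 100, 200),
]

rows, iv = woe_iv(age_bins)
for label, woe in rows:
    print(f"{label:>6}  WOE = {woe:+.3f}")
print(f"IV = {iv:.3f}")

# The abstract reports that characteristics with IV below 0.05 were rejected.
keep_characteristic = iv >= 0.05
print("retain for scorecard:", keep_characteristic)
```

Under this convention, a positive WOE marks a bin with a lower-than-average share of HIV-positive cases, and a characteristic whose bins all have WOE near zero contributes almost nothing to the IV.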




