Conference: Annual Conference of Japanese Society for Medical and Biological Engineering; Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Signal-processing-based bioinformatics approach for the identification of influenza A virus subtypes in Neuraminidase genes

Abstract

The neuraminidase (NA) genes of influenza A virus are a highly promising target for antiviral drug development, but this potential can only be realized through accurate identification of the virus subtypes. In this paper, a hybrid predictive model is developed to detect the subtypes accurately and is tested on proteins obtained from four subtypes of the influenza A virus, namely H1N1, H2N2, H3N2 and H5N1, that caused major pandemics in the twentieth century. The predictive model is built in four main steps: (i) converting the protein sequences into numerical signals by means of the electron-ion interaction potential (EIIP) amino acid scale, (ii) analysing these signals with the Discrete Fourier Transform (DFT) and extracting DFT-based features, (iii) selecting a more influential subset of the features with the F-score statistical feature selection method, and finally (iv) building a predictive model on the feature subset with a support vector machine classifier. The protein sequences were chosen for the high percentage identity they show within individual influenza subtype classes and for the high variation they display in percentage identity. This makes the proteins very difficult to distinguish from each other even when they belong to different subtypes. On this set of proteins, the predictive model yielded 98.3% accuracy under 5-fold cross-validation. The procedure also yields a twenty-feature subset that can help reveal spectral characteristics of the subtypes. The proposed model is promising and can easily be generalized to other similar studies.
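Steps (i) to (iii) of the abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation. The EIIP table holds the commonly tabulated values and should be checked against a published source before use. The abstract does not say which DFT-based features were extracted, so the full power spectrum is used here as a stand-in. The F-score is shown in its common two-class form, whereas the paper deals with four subtypes and does not describe how it handled that. The function names `encode_eiip`, `dft_power` and `f_score` are hypothetical.

```python
import cmath

# Commonly tabulated EIIP values per amino acid (assumption: verify
# against a published table before relying on them).
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def encode_eiip(sequence):
    """Step (i): map a protein sequence to a numerical signal."""
    return [EIIP[residue] for residue in sequence.upper()]


def dft_power(signal):
    """Step (ii): power spectrum |X(k)|^2 from a direct O(n^2) DFT."""
    n = len(signal)
    power = []
    for k in range(n):
        x_k = sum(
            value * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, value in enumerate(signal)
        )
        power.append(abs(x_k) ** 2)
    return power


def f_score(pos, neg):
    """Step (iii): two-class F-score of a single feature.

    pos and neg hold the feature's values in the two classes; each
    class needs at least two samples for the sample variance.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def sample_var(xs, m):
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    m_all, m_pos, m_neg = mean(pos + neg), mean(pos), mean(neg)
    between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
    within = sample_var(pos, m_pos) + sample_var(neg, m_neg)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within
```

In a full pipeline, each spectral coefficient would be scored this way, the highest-scoring ones kept, and the reduced feature vectors passed to a support vector machine for step (iv), which is omitted here because it needs a third-party library.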