Source: Computer and Information Sciences, 2008 23rd International Symposium on (ISCIS)

Gene ontology prediction using compression based distances and alignment scores on both amino acid sequence and secondary structure


Abstract

Normalized Compression Distance (NCD) is a compression-based pairwise distance measure. NCD has been shown to perform well in different domains, such as music, biological sequence, and text classification. In this study, we use NCD together with Smith-Waterman (SW) alignment scores of protein sequences for gene ontology prediction. We find that using secondary structure in addition to the amino acid sequence improves prediction performance for both NCD and SW alignment scores when each is used on its own. The best contribution ratio of secondary structure is 0.25 for SW alignment scores and 0.50 for NCD scores. We also investigate using NCD and SW together, on both the amino acid sequence and the secondary structure. This combination predicts better than NCD alone, but worse than SW alone.
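
The abstract does not spell out how NCD is computed or how the secondary-structure contribution ratio is applied, so the following is only a minimal sketch under stated assumptions. It uses the standard NCD definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The choice of zlib as the compressor, the function names, the three-letter H/E/C secondary-structure alphabet, and the linear weighting in `combine` are all assumptions made for illustration, not details taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two sequences given as ASCII strings."""
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compressed size of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def combine(aa_score: float, ss_score: float, ss_ratio: float) -> float:
    """Blend an amino-acid score with a secondary-structure score.

    Assumption: the "contribution ratio" is read as a linear weight, so
    ss_ratio = 0.25 means 75% amino acid and 25% secondary structure.
    """
    if not 0.0 <= ss_ratio <= 1.0:
        raise ValueError("ss_ratio must lie in [0, 1]")
    return (1.0 - ss_ratio) * aa_score + ss_ratio * ss_score


if __name__ == "__main__":
    # Toy inputs for illustration only; they are not data from the study.
    aa_1, aa_2 = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "MKTAYIAKQRNISFVKAHFSRQLEDRLGLIEVQ"
    ss_1, ss_2 = "CCHHHHHHHHHHHHCCCEEEEEECCCHHHHHHC", "CCHHHHHHHHHHHCCCCEEEEEECCCHHHHHHC"
    d_aa = ncd(aa_1, aa_2)
    d_ss = ncd(ss_1, ss_2)
    # 0.50 is the ratio the abstract reports as best for NCD scores.
    print(combine(d_aa, d_ss, ss_ratio=0.50))
```

Under the same assumed weighting, an SW-based variant would pass alignment scores to `combine` with `ss_ratio=0.25`, the ratio reported as best for SW. Note that NCD is a distance (lower means more similar) while SW scores are similarities (higher means more similar), so mixing the two would need a consistent orientation and scale first.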
