Conference: Advances in Knowledge Discovery and Data Mining

Evaluation of Techniques for Classifying Biological Sequences


Abstract

In recent years, the amount of biological information available in public databases, as DNA or protein sequences, has increased exponentially. This growth has driven interest in computational techniques that automatically classify these large volumes of sequence data into categories corresponding to their role in the chromosomes, their structure, and/or their function. In this paper we evaluate some of the widely used sequence classification algorithms and develop a framework for modeling sequences so that traditional machine learning algorithms, such as support vector machines (SVMs), can be applied easily. Our detailed experimental evaluation shows that the SVM-based approaches achieve higher classification accuracy than more traditional sequence classification algorithms, such as Markov-model-based techniques and K-nearest-neighbor-based approaches.
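The abstract does not say how sequences are modeled for the learning algorithms. As a rough illustration only, one common way to turn a variable-length sequence into a fixed-length vector is to use k-mer frequencies; this is an assumption here, not the paper's stated framework. In the sketch below, `kmer_vector`, `nearest_label`, and the toy sequences and labels are all hypothetical, and a simple 1-nearest-neighbor rule stands in for the classifier so the example needs only the standard library.

```python
from collections import Counter
from itertools import product
from math import dist


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Map a sequence to a fixed-length vector of k-mer frequencies.

    Assumption: k-mer counting is an illustrative feature map, not
    necessarily the representation used in the paper.
    """
    # Enumerate all possible k-mers in a fixed order so every sequence
    # yields a vector with the same layout.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def nearest_label(query, training, k=2):
    """Return the label of the training sequence closest to the query
    (1-nearest neighbor, Euclidean distance between k-mer vectors)."""
    q = kmer_vector(query, k)
    best = min(training, key=lambda item: dist(q, kmer_vector(item[0], k)))
    return best[1]


# Hypothetical toy data, made up for illustration.
training = [("ATATATAT", "AT-rich"), ("GCGCGCGC", "GC-rich")]
print(nearest_label("ATATTATA", training))
```

Once sequences are fixed-length vectors like these, they could be passed to any standard vector-based learner, an SVM included, in place of the nearest-neighbor rule used here.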
