Conference: The 2010 International Joint Conference on Neural Networks

DNA base-calling using polynomial classifiers

Abstract

Base-calling is one of many problems that can be solved using pattern recognition, the act of classifying raw data into classes based on prior knowledge or statistical information extracted from the data. In this paper, we propose a new framework that uses polynomial classifiers to model electropherogram traces obtained from ABI sequencing machines in order to perform base-calling. First, pre-processing, which includes segmented normalization and peak sharpening, is performed to reduce the imperfections introduced into a trace by the chemistry involved. Discriminative feature vectors are then extracted from the chromatogram traces and expanded to a higher-dimensional space by second-order polynomial expansion. A linear classifier is then trained on the expanded features and used to classify the bases. The chromatogram traces chosen for analysis belong to Homo sapiens, Saccharomyces mikatae and Drosophila melanogaster. Simulation results indicated an accuracy of up to 99.2% when testing three different chromatogram traces of about 600 to 800 bases each. The proposed model's performance was compared with the existing standards, ABI and PHRED, in terms of insertion, deletion and substitution errors. Simulation evidence indicated that the designed model performs comparably to or slightly better than ABI in terms of deletion and insertion errors. Moreover, the polynomial classifier produced negligible substitution errors compared to ABI. The polynomial classifier was also observed to perform comparably to PHRED in terms of deletion and substitution errors. The results demonstrate the potential of this model to perform base-calling.
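The expansion-then-linear-classifier step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the feature vector is taken to be the four channel intensities at a peak, the toy data are synthetic, the linear classifier is fit by ordinary least squares against one-hot targets, and the names `poly2_expand`, `train`, `call_bases` and `make_toy_peaks` are hypothetical. The abstract does not state which features or training procedure the paper actually uses.

```python
import numpy as np
from itertools import combinations_with_replacement

BASES = "ACGT"  # class index k corresponds to BASES[k]


def poly2_expand(X):
    """Second-order expansion: each row x becomes [1, x_i..., x_i*x_j for i <= j]."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    n, d = X.shape
    pairs = list(combinations_with_replacement(range(d), 2))
    quad = np.stack([X[:, i] * X[:, j] for i, j in pairs], axis=1)
    return np.hstack([np.ones((n, 1)), X, quad])


def train(X, labels):
    """Fit one linear model per base in the expanded space (least squares, assumed)."""
    P = poly2_expand(X)
    T = np.eye(len(BASES))[labels]  # one-hot targets, shape (n, 4)
    W, *_ = np.linalg.lstsq(P, T, rcond=None)
    return W


def call_bases(W, X):
    """Score each peak against the four models and pick the highest-scoring base."""
    scores = poly2_expand(X) @ W
    return "".join(BASES[k] for k in scores.argmax(axis=1))


def make_toy_peaks(n, seed=0):
    """Synthetic stand-in for pre-processed peaks: one dominant channel plus noise."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 4, size=n)
    X = rng.uniform(0.0, 0.2, size=(n, 4))
    X[np.arange(n), labels] += 1.0
    return X, labels


if __name__ == "__main__":
    X_train, y_train = make_toy_peaks(200, seed=0)
    X_test, y_test = make_toy_peaks(50, seed=1)
    W = train(X_train, y_train)
    called = call_bases(W, X_test)
    truth = "".join(BASES[k] for k in y_test)
    matches = sum(a == b for a, b in zip(called, truth))
    print(f"{matches}/{len(truth)} toy peaks called correctly")
```

With four input features the expanded vector has 15 terms (1 constant, 4 linear, 10 quadratic), so the classifier stays linear in its parameters while the decision boundaries are quadratic in the original features. Real traces would also need the pre-processing and peak detection that the abstract mentions, which this sketch omits.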
