International Conference on Computational Methods in Systems Biology

Frameshift Correction in De Novo Assembled Transcriptome Data Using Peptide Data, Blast Sequence Alignments and Hidden Markov Models


Abstract

Frameshift errors in de novo sequenced transcriptome data are difficult to find because not all transcripts are covered by publicly available, curated reference data. One approach to finding frameshift errors is the use of hidden Markov models (HMMs). HMMs are widely used in bioinformatics to identify patterns in sequences, such as coding or conserved regions [1]. However, few efforts have been made to use HMMs for detecting frameshift errors in nucleotide sequences. Here we introduce an approach for frameshift correction using hidden Markov models, BLAST alignment data, and peptide data derived from mass spectrometry [2]. The algorithm is implemented as a pipeline with three correction steps for transcriptomic sequences. First, it employs peptide data for a preliminary correction. Second, it aligns the investigated sequences against publicly available protein databases, such as SwissProt, using the BlastX alignment tool. Finally, the resulting alignment files are used to create training data sets for the HMM from known coding and shifted regions of the sequences. The trained HMM is then used to perform the final identification.
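The abstract does not say how the BlastX alignments are turned into known coding and shifted regions. One common heuristic, shown here only as an illustrative sketch and not as the authors' method, is to flag places where neighbouring alignments of one transcript to the same protein switch reading frame. The `Hsp` record, the `frameshift_candidates` function, and the hit names `protA` and `protB` are hypothetical; the sketch also assumes forward-strand frames only and alignments that have already been parsed from BlastX output.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hsp:
    """Hypothetical minimal record for one BlastX alignment segment."""
    subject_id: str  # identifier of the protein hit
    q_start: int     # 1-based start position on the transcript
    q_end: int       # 1-based end position on the transcript
    frame: int       # query reading frame: 1, 2 or 3 (forward strand only)


def frameshift_candidates(hsps: List[Hsp]) -> List[Tuple[str, int, int, int, int]]:
    """Report neighbouring segments to the same protein that change frame.

    Each result is (subject_id, end_of_left_segment, start_of_right_segment,
    frame_before, frame_after). The frameshift, if real, lies near that gap.
    """
    # Group segments by protein hit so frames are only compared within one hit.
    by_subject: Dict[str, List[Hsp]] = {}
    for hsp in hsps:
        by_subject.setdefault(hsp.subject_id, []).append(hsp)

    candidates = []
    for subject_id, group in by_subject.items():
        ordered = sorted(group, key=lambda h: h.q_start)
        # Compare each segment with the next one along the transcript.
        for left, right in zip(ordered, ordered[1:]):
            if left.frame != right.frame:
                candidates.append(
                    (subject_id, left.q_end, right.q_start, left.frame, right.frame)
                )
    return candidates


# Made-up example data: protA aligns in frame 1, then continues in frame 2.
hits = [
    Hsp("protA", 1, 300, 1),
    Hsp("protA", 302, 619, 2),
    Hsp("protB", 12, 401, 3),
]
print(frameshift_candidates(hits))
```

In a pipeline like the one described, candidate regions of this kind could serve as labelled "shifted" examples next to consistently aligned "coding" regions, but how the training sets are actually built is not stated in the abstract.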
