Integrating Overlapping Structures and Background Information of Words Significantly Improves Biological Sequence Comparison

Abstract

Word-based models have achieved promising results in sequence comparison. However, how to use the overlapping structures and background information of words, both important statistical properties of words in biological sequences, to improve sequence comparison remains an open problem. This paper proposes a new statistical method that integrates the overlapping structures and the background information of the words in biological sequences. To assess the effectiveness of this integration for sequence comparison, two sets of evaluation experiments were conducted. The first, performed via receiver operating characteristic (ROC) curve analysis, applies the proposed method to discriminating functionally related regulatory sequences from unrelated sequences, intron and exon. The second evaluates the proposed method with the F-measure for clustering Hepatitis E virus genotypes. The results show that the proposed method, which integrates the overlapping structures and the background information of words, significantly improves biological sequence comparison and outperforms existing models.
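The abstract does not give the method's formulas, so the following is only a generic sketch of the word-based idea, not the paper's model. It counts every length-k word including overlapping occurrences, subtracts the count expected under a simple independent-letter background, and compares two sequences by the cosine of the centered count vectors. The function names, the i.i.d. background, and the cosine score are all assumptions made for illustration; the sketch also assumes non-empty DNA sequences over the alphabet ACGT.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"  # assumption: DNA sequences


def count_words(seq, k):
    """Count every length-k word, including overlapping occurrences."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def base_frequencies(seq):
    """Estimate an i.i.d. background model from single-letter frequencies."""
    counts = Counter(seq)
    total = sum(counts[b] for b in ALPHABET)
    return {b: counts[b] / total for b in ALPHABET}


def centered_counts(seq, k):
    """Observed word counts minus the counts expected under the background."""
    observed = count_words(seq, k)
    freqs = base_frequencies(seq)
    n_positions = len(seq) - k + 1
    vector = []
    for word in map("".join, product(ALPHABET, repeat=k)):
        expected = n_positions
        for base in word:
            expected *= freqs[base]
        vector.append(observed[word] - expected)
    return vector


def similarity(seq_a, seq_b, k=2):
    """Cosine similarity of background-centered word count vectors."""
    a = centered_counts(seq_a, k)
    b = centered_counts(seq_b, k)
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # 0.0 when a vector carries no signal
```

This sketch only counts overlapping occurrences; it does not model how a word's self-overlap affects the statistics of its count, which is the kind of overlapping structure the abstract refers to.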
