Analysis of data partitioning on correlated data to genetic sequence searches using string matching algorithms.

Abstract

Creating a distributed approach to genetic database sequence searches requires partitioning the data into multiple sections. However, the nature of the data leaves open the possibility that partitioning cuts the queried sequence into unrecognizable pieces. Adding an overlap to each partition that is less than or equal to half the length of the query sequence corrects this problem. This thesis demonstrates the approach using English texts. The English texts were first correlated with genetic data and partitioned into groups of various sizes, and overlap was then applied in incremental steps to the smallest partition size. The Knuth-Morris-Pratt and Boyer-Moore string search algorithms were used to locate a small query sequence that was cut during partitioning and recovered through the use of overlap.
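The idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the sample text, and the partition size are hypothetical, only Knuth-Morris-Pratt is shown (Boyer-Moore is omitted), and the sketch assumes that the overlap extends each partition on both ends, which the abstract does not state. Under that assumption, an overlap of half the query length (rounded down) is enough for every occurrence that straddles a boundary to fall wholly inside one of the two neighbouring partitions.

```python
def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text (Knuth-Morris-Pratt)."""
    if not pattern:
        return []
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    hits = []
    k = 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = failure[k - 1]  # keep going so overlapping matches are found
    return hits


def partition_with_overlap(data, size, overlap):
    """Yield (offset, chunk) pairs; each chunk is widened by `overlap` characters
    on both ends (an assumption of this sketch)."""
    for start in range(0, len(data), size):
        lo = max(0, start - overlap)
        hi = min(len(data), start + size + overlap)
        yield lo, data[lo:hi]


def partitioned_search(data, query, size, overlap):
    """Search every partition independently and merge the global match positions."""
    hits = set()  # a set removes matches reported by two neighbouring chunks
    for offset, chunk in partition_with_overlap(data, size, overlap):
        hits.update(offset + i for i in kmp_search(chunk, query))
    return sorted(hits)


data = "the quick brown fox jumps over the lazy dog"
query = "brown"  # starts at index 10 and straddles the partition boundary at 12

no_overlap = partitioned_search(data, query, size=12, overlap=0)
with_overlap = partitioned_search(data, query, size=12, overlap=len(query) // 2)
print(no_overlap)    # [] : the query was cut in two and neither piece matches
print(with_overlap)  # [10]
```

Each partition can be searched independently, so in a distributed setting the loop body of `partitioned_search` is the unit of work that would be sent to a separate node; the set union at the end discards any match that two neighbouring partitions both report.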

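Bibliographic record

  • Author

    Nance, David R.;

  • Author affiliation

    The University of Alabama in Huntsville.;

  • Degree-granting institution The University of Alabama in Huntsville.;
  • Subject Computer science.
  • Degree M.S.
  • Year 2006
  • Pages 80 p.
  • Total pages 80
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification TS97-4;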