
Modeling biological structures via abstract grammars to solve common problems in computational biology.



Abstract

Grammars are generally understood to be sets of rules that define the relationships between elements of a language. However, grammars can also be used to elucidate structural relationships within sequences constructed from any finite alphabet. In this work, abstract grammars are used to model the primary and secondary structures present in biological data. These grammar models are inferred and applied to efficiently solve various sequence analysis problems in computational biology, including multiple sequence alignment, fragment assembly, database redundancy removal, and structural prediction.

The primary structures of biological data, that is, the sequential ordering of symbols, are first modeled with Lempel-Ziv (LZ) grammars. The results are used to construct a grammar-based sequence distance metric, which compares biological sequences by comparing their inferred grammars. This concept is applied to several problems in biological sequence analysis, including multiple sequence alignment and phylogenetic clustering. The higher-level secondary structures of biological sequences are then modeled via two novel grammar inference methods. The resulting context-free grammars are used to estimate structural pieces within biological sequences, which can in turn be used as supplemental information to help guide various sequence analysis algorithms. The use of this approach to develop algorithms for various sequence analysis tasks demonstrates the viability and versatility of using abstract grammars to model biological data.
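To make the idea of comparing sequences through their inferred LZ grammars more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the dissertation's method: the abstract does not give the parsing variant or the distance formula, so the LZ78-style parse, the function names (`lz78_phrases`, `lz_complexity`, `grammar_distance`), and the normalization used below are all hypothetical choices made for this example.

```python
def lz78_phrases(seq: str) -> list[str]:
    """Parse seq into LZ78-style phrases.

    Each phrase is the shortest prefix of the remaining input that has
    not been seen as a phrase before. The phrase list plays the role of
    a simple inferred "grammar" (dictionary) for the sequence.
    """
    seen = set()
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Trailing fragment that matches an earlier phrase.
        phrases.append(current)
    return phrases


def lz_complexity(seq: str) -> int:
    """Number of phrases in the parse: a rough size of the grammar."""
    return len(lz78_phrases(seq))


def grammar_distance(x: str, y: str) -> float:
    """Illustrative distance (an assumption, not the thesis formula).

    Measures how many extra phrases are needed to parse one sequence
    when the other has already been seen, normalized by the larger of
    the two individual complexities.
    """
    cx, cy = lz_complexity(x), lz_complexity(y)
    if max(cx, cy) == 0:
        return 0.0
    extra_xy = lz_complexity(x + y) - cx  # new phrases y adds after x
    extra_yx = lz_complexity(y + x) - cy  # new phrases x adds after y
    return max(extra_xy, extra_yx) / max(cx, cy)


if __name__ == "__main__":
    print(lz78_phrases("ABABABA"))  # ['A', 'B', 'AB', 'ABA']
    print(grammar_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

The intuition is that a sequence sharing structure with one already parsed adds few new phrases, so the numerator stays small. This simple variant is symmetric and non-negative but does not guarantee a zero distance between identical sequences, so it should be read as a similarity heuristic rather than a true metric.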

Bibliographic record

  • Author: Russell, David James
  • Author affiliation: The University of Nebraska - Lincoln
  • Degree-granting institution: The University of Nebraska - Lincoln
  • Subject: Biology, Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 220
  • Original format: PDF
  • Language: English
  • Added to database: 2022-08-17 11:37:34
