Homals-clustering analysis and its applications in computational sequence analysis.

Abstract

Searching for conserved sequence patterns of known cis-regulatory elements not only provides an initial step towards elucidating their structures and mechanisms, but also greatly aids the prediction of novel regulatory elements. However, owing to some special properties of those elements, such as the lack of primary sequence similarity or the tolerance of variation in nucleotide bases, the TRANSFAC and miRBase databases still rely on expert systems to perform the search manually. To tackle the challenge of automatically searching for conserved sequence patterns, we developed a novel method, Homals-clustering analysis, which clusters sequences based on the grouped N-mers (representing conserved patterns) that they share. The proposed Homals-clustering analysis consolidates a decryption of N-mers, homogeneity analysis, and a newly designed jigsaw-puzzle clustering and multi-layer clustering strategy into a unified framework. We evaluated its performance on yeast data from TRANSFAC and on human and mouse data from miRBase, comparing it with several related studies and methods; the results showed that our method detects conserved patterns with high sensitivity and robustness. Most importantly, since it requires no expert intervention, it enables users without expert knowledge to exploit those databases on an up-to-date basis.
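The abstract describes clustering sequences by the N-mers they share. As a rough illustration of that general idea only, the sketch below links sequences whose N-mer sets overlap strongly. It is not the thesis's method: it omits homogeneity analysis, N-mer grouping, and the jigsaw-puzzle and multi-layer strategies. The function names, the Jaccard measure, the single-linkage rule, and the `n` and `threshold` parameters are all assumptions made for this example.

```python
from itertools import combinations


def nmers(seq, n):
    """Return the set of all length-n substrings (N-mers) of seq."""
    seq = seq.upper()
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_shared_nmers(seqs, n=4, threshold=0.3):
    """Hypothetical single-linkage grouping: link two sequences when the
    Jaccard similarity of their N-mer sets reaches `threshold`."""
    profiles = [nmers(s, n) for s in seqs]
    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if jaccard(profiles[i], profiles[j]) >= threshold:
            parent[find(i)] = find(j)

    # Collect indices by their union-find root.
    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "TTTTGGGG", "TTTTGGGC"]  # made-up sequences
    print(cluster_by_shared_nmers(toy, n=4, threshold=0.3))
```

The function returns clusters as lists of indices into the input, so the caller can map them back to sequence identifiers. Single linkage is used here only because it is short to write; it can chain loosely related sequences together, so a real analysis would need a more careful grouping step.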

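Bibliographic details

  • Author: Hsieh, Ya-Ching
  • Author affiliation: University of California, Los Angeles
  • Degree-granting institution: University of California, Los Angeles
  • Subject: Bioinformatics
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 95 p.
  • Total pages: 95
  • Original format: PDF
  • Language: English (eng)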
