Conference: International workshop on comparative genomics

Non-parametric and Semi-parametric Support Estimation Using SEquential RESampling Random Walks on Biomolecular Sequences

Abstract

Non-parametric and semi-parametric resampling procedures are widely used to perform support estimation in computational biology and bioinformatics. Among the most widely used methods in this class is the standard bootstrap method, which consists of random sampling with replacement. While the bootstrap and related techniques do not require assumptions about any particular parametric model for resampling purposes, they do assume that sites are independent and identically distributed (i.i.d.). The i.i.d. assumption can be an over-simplification for many problems in computational biology and bioinformatics. In particular, sequential dependence within biomolecular sequences is often an essential biological feature due to biochemical function, evolutionary processes such as recombination, and other factors. To relax the simplifying i.i.d. assumption, we propose a new non-parametric/semi-parametric sequential resampling technique that generalizes "Heads-or-Tails" mirrored inputs, a simple but clever technique due to Landan and Graur. The generalized procedure takes the form of random walks along either aligned or unaligned biomolecular sequences. We refer to our new method as the SERES (or "SEquential RESampling") method. To demonstrate the performance of the new technique, we apply SERES to estimate support for the multiple sequence alignment problem. Using simulated and empirical data, we show that SERES-based support estimation yields performance that is comparable to, and typically better than, that of state-of-the-art methods.
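
The contrast drawn in the abstract, between i.i.d. site resampling and resampling that preserves sequential order, can be illustrated with a minimal Python sketch. This is not the authors' SERES algorithm, whose details the abstract does not give. The function names, the reversal probability `p_reverse`, and the rule of reflecting at the sequence ends are all assumptions made for illustration only.

```python
import random


def bootstrap_columns(n_sites, rng):
    """Standard bootstrap: draw n_sites site indices i.i.d. with replacement."""
    return [rng.randrange(n_sites) for _ in range(n_sites)]


def random_walk_columns(n_sites, length, rng, p_reverse=0.1):
    """Hypothetical sequential resampling: a walk over neighbouring sites.

    Assumed rules (not taken from the paper): start at a random site,
    move one site per step, flip direction with probability p_reverse,
    and reflect whenever the walk would leave the sequence.
    """
    if n_sites < 2:
        raise ValueError("need at least two sites to walk")
    pos = rng.randrange(n_sites)
    direction = rng.choice((-1, 1))
    walk = [pos]
    while len(walk) < length:
        if rng.random() < p_reverse:
            direction = -direction
        nxt = pos + direction
        if nxt < 0 or nxt >= n_sites:
            # Reflect at either end of the sequence.
            direction = -direction
            nxt = pos + direction
        pos = nxt
        walk.append(pos)
    return walk


def resample(alignment, columns):
    """Build a replicate by reading the chosen columns from every row."""
    return ["".join(row[c] for c in columns) for row in alignment]


if __name__ == "__main__":
    rng = random.Random(0)
    toy_alignment = ["ACGTACGT", "ACGAACGA", "TCGTACCT"]  # made-up data
    n = len(toy_alignment[0])
    print(resample(toy_alignment, bootstrap_columns(n, rng)))
    print(resample(toy_alignment, random_walk_columns(n, n, rng)))
```

In the bootstrap replicate, neighbouring columns are unrelated draws, so any dependence between adjacent sites is lost. In the walk replicate, every column is followed by one of its original neighbours, which is the property that sequential resampling is meant to retain.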
