Journal of Computational Biology

Efficient Design of Compact Unstructured RNA Libraries Covering All k-mers

Abstract

Current microarray technologies to determine RNA structure or measure protein–RNA interactions rely on single-stranded, unstructured RNA probes on a chip covering together all k-mers. Since space on the array is limited, the problem is to efficiently design a compact library of unstructured ℓ-long RNA probes, where each k-mer is covered at least p times. Ray et al. designed such a library for specific values of k, ℓ, and p using ad hoc rules. To our knowledge, there is no general method to date to solve this problem. Here, we address the problem of finding a minimum-size covering of all k-mers by ℓ-long sequences with the desired properties for any value of k, ℓ, and p. As we prove that the problem is NP-hard, we give two solutions: the first is a greedy algorithm with a logarithmic approximation ratio; the second, a heuristic greedy approach based on random walks in de Bruijn graphs. The heuristic algorithm works well in practice and produces a library of unstructured RNA probes that is only ∼1.1-times greater in size compared to the theoretical lower bound. We present results for typical values of k and probe lengths ℓ and show that our algorithm generates a library that is significantly smaller than the library of Ray et al.; moreover, we show that our algorithm outperforms naive methods. Our approach can be generalized and extended to generate RNA or DNA oligo libraries with other desired properties. The software is freely available online.
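
The covering problem in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm or software, and every name in it (`lower_bound`, `greedy_walk_library`, `is_unstructured`, `max_tries`) is hypothetical. It assumes that the lower bound is the simple counting bound (an ℓ-long probe contains at most ℓ − k + 1 distinct k-mers, so at least ⌈p·4^k / (ℓ − k + 1)⌉ probes are needed), that a k-mer counts as covered once per probe containing it, and that the structure test is a caller-supplied predicate that accepts every probe by default. Each probe is grown as a walk over overlapping k-mers, as in a de Bruijn graph, always stepping to a least-covered successor and breaking ties at random.

```python
import itertools
import math
import random

ALPHABET = "ACGU"


def lower_bound(k, ell, p=1):
    """Counting bound: a probe of length ell holds at most ell-k+1 distinct k-mers."""
    return math.ceil(p * len(ALPHABET) ** k / (ell - k + 1))


def greedy_walk_library(k, ell, p=1, seed=0,
                        is_unstructured=lambda probe: True, max_tries=50):
    """Build probes until every k-mer occurs in at least p of them.

    is_unstructured is a placeholder for a real RNA folding check
    (an assumption of this sketch; the default accepts everything).
    """
    rng = random.Random(seed)
    coverage = {"".join(t): 0 for t in itertools.product(ALPHABET, repeat=k)}
    library = []
    while True:
        deficient = [m for m, c in coverage.items() if c < p]
        if not deficient:
            return library
        for _ in range(max_tries):
            # Start the walk at a k-mer that still needs coverage.
            probe = rng.choice(deficient)
            seen = {probe}
            while len(probe) < ell:
                suffix = probe[-(k - 1):] if k > 1 else ""

                def score(base):
                    # Prefer rarely covered k-mers; penalise repeats in this probe.
                    kmer = suffix + base
                    return coverage[kmer] + (p if kmer in seen else 0)

                best = min(score(b) for b in ALPHABET)
                base = rng.choice([b for b in ALPHABET if score(b) == best])
                seen.add(suffix + base)
                probe += base
            if is_unstructured(probe):
                break
        else:
            raise RuntimeError("no acceptable probe found within max_tries")
        library.append(probe)
        for kmer in seen:
            coverage[kmer] += 1  # counted once per probe


if __name__ == "__main__":
    lib = greedy_walk_library(k=4, ell=12, p=1)
    print(len(lib), "probes; counting lower bound:", lower_bound(4, 12, 1))
```

Comparing `len(lib)` with `lower_bound(k, ell, p)` gives a rough sense of how far a heuristic library is from the counting bound. A real design would replace the default predicate with an actual secondary-structure check.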
