Published in: Bioinformatics Research and Applications; Lecture Notes in Bioinformatics; 4463

Space and Time Efficient Algorithms to Discover Endogenous RNAi Patterns in Complete Genome Data


Abstract

RNA interference (RNAi), a phenomenon that inhibits the expression of genes, is widely adopted in laboratories for the study of pathways and the determination of gene function. Recent studies have shown that RNAi could be used to treat diseases such as cancers and some genetic disorders in which the down-regulation of a protein could prevent or stop progression of the disease. In [7], the problem of detecting endogenous dsRNA control elements and their corresponding mRNA targets, i.e., the genes under RNAi control by degradation, in complete genomes of species using a suffix tree data structure is discussed. While that algorithm identifies triple repeats in the genome sequence in linear time, its very high memory requirement (12 GB for the C. elegans genome of size 100 Mbp) becomes a bottleneck for processing higher-order genomes. In this paper, we give algorithms that are more space and time efficient in practice than the suffix tree based algorithm. Our algorithms are based on simple array data structures and adopt basic sorting techniques to identify the desired patterns in a given genome sequence. For the C. elegans genome, we achieve a speedup by a factor of 23 and a reduction in memory requirement by a factor of 12 over the suffix tree approach, making it possible to process higher-order genomes when detecting such endogenous controls and targets for RNAi by degradation.
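
The abstract does not spell out the array-and-sorting algorithms themselves, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, for illustration, that a "repeat" is a fixed-length substring (k-mer) occurring at least three times; the function name `repeated_kmers` and the parameters `k` and `min_count` are hypothetical. A plain array of start positions is sorted by the k-mer at each position, so identical k-mers end up adjacent and can be grouped in one pass.

```python
from itertools import groupby


def repeated_kmers(genome: str, k: int, min_count: int = 3) -> dict[str, list[int]]:
    """Return every k-mer occurring at least min_count times, with its start positions.

    Illustrative assumption: a repeat is an exact k-mer match. The paper's
    actual pattern definition may differ.
    """
    if k <= 0 or k > len(genome):
        return {}

    def kmer_at(i: int) -> str:
        return genome[i:i + k]

    # Array of start positions, one per k-mer window in the sequence.
    starts = list(range(len(genome) - k + 1))
    # Sort positions by the k-mer they begin; equal k-mers become adjacent.
    # Python's sort is stable, so positions within a group stay ascending.
    starts.sort(key=kmer_at)

    repeats: dict[str, list[int]] = {}
    for kmer, group in groupby(starts, key=kmer_at):
        positions = list(group)
        if len(positions) >= min_count:
            repeats[kmer] = positions
    return repeats


if __name__ == "__main__":
    # Toy sequence: "ACG" starts at positions 0, 4 and 8.
    print(repeated_kmers("ACGTACGTACGA", k=3))
```

Sorting costs more comparisons than a linear-time suffix tree traversal, but the only auxiliary structure here is a flat array of positions, which is the kind of trade-off the abstract describes in general terms.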
