Venue: European Conference on Modelling and Simulation

AN EFFICIENT METHOD FOR COMPRESSING AND SEARCHING GENOMIC DATABASES

Abstract

Biological databases are growing significantly, as is the number of queries directed at them. In 2005, the genomic databases at the National Center for Biotechnology Information (NCBI) received about 50 million web hits per day, at peak rates of about 1,900 hits per second. As these databases become more popular, demand grows to make them faster and more efficient. In this paper, we propose a method for compressing and searching selected genome databases using techniques appropriate for computers of virtually any size. The search technique is expected to produce its best results with large search sequences against large DNA databases, and it lends itself to parallel computation with little communication overhead. Because the compression algorithm uses a lossless binary encoding format, search results are exact, not approximate. Furthermore, searches take place on the compressed data, so no decompression is needed before executing a search.
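The abstract does not spell out the encoding or the search procedure, so the following is only a minimal sketch of the general idea, assuming a 2-bit-per-base code over the alphabet A, C, G, T. The code table and the names `pack`, `unpack`, and `find_all` are hypothetical and are not taken from the paper. The sketch illustrates two properties the abstract claims: the encoding is lossless (the round trip recovers the original string), and matching is done on the packed bits without decompressing the database.

```python
# Hypothetical 2-bit code table; the paper's actual encoding is not given in the abstract.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"


def pack(seq: str) -> int:
    """Pack a DNA string into an integer, 2 bits per base, first base in the highest bits."""
    value = 0
    for base in seq:
        value = (value << 2) | CODES[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover the original string; the length is stored separately because leading 'A's pack to zero bits."""
    return "".join(
        BASES[(value >> (2 * (length - 1 - i))) & 0b11] for i in range(length)
    )


def find_all(db: str, query: str) -> list[int]:
    """Return every start position of query in db, comparing packed bits only."""
    n, m = len(db), len(query)
    if m == 0 or m > n:
        return []
    packed_db = pack(db)
    packed_query = pack(query)
    mask = (1 << (2 * m)) - 1  # keeps the low 2*m bits, i.e. one query-sized window
    hits = []
    for pos in range(n - m + 1):
        shift = 2 * (n - m - pos)  # moves the window starting at pos into the low bits
        if (packed_db >> shift) & mask == packed_query:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    print(find_all("ACGTACGT", "CGT"))
    print(unpack(pack("AACGT"), 5))
```

Each window comparison is a single integer equality test on the packed form, so a match is exact by construction. A parallel variant could split the database into overlapping chunks and run `find_all` independently on each, which is one plausible reading of the low communication overhead the abstract mentions.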
