Journal: Bioinformatics

The Amordad database engine for metagenomics

Abstract

Motivation: Several technical challenges in metagenomic data analysis, including assembling metagenomic sequence data or identifying operational taxonomic units, are both significant and well known. These forms of analysis are increasingly cited as conceptually flawed, given the extreme variation within traditionally defined species and rampant horizontal gene transfer. Furthermore, computational requirements of such analysis have hindered content-based organization of metagenomic data at large scale.

Results: In this article, we introduce the Amordad database engine for alignment-free, content-based indexing of metagenomic datasets. Amordad places the metagenome comparison problem in a geometric context, and uses an indexing strategy that combines random hashing with a regular nearest neighbor graph. This framework allows refinement of the database over time by continual application of random hash functions, with the effect of each hash function encoded in the nearest neighbor graph. This eliminates the need to explicitly maintain the hash functions in order for query efficiency to benefit from the accumulated randomness. Results on real and simulated data show that Amordad can support logarithmic query time for identifying similar metagenomes even as the database size reaches into the millions.
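The indexing idea in the abstract (apply a random hash function, record its effect in a nearest neighbor graph, then discard the hash function) can be illustrated with a small sketch. This is not the Amordad implementation. It assumes each metagenome is already represented as a non-zero numeric feature vector, uses random-hyperplane hashing with angular distance as one possible random hash, and answers queries with a simple greedy graph walk. The names `angle`, `refine`, and `query`, and the parameters `degree` and `n_bits`, are hypothetical.

```python
import math
import random
from collections import defaultdict


def angle(u, v):
    """Angular distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def refine(points, graph, degree, n_bits, rng):
    """Apply one random hash function to the database, then discard it.

    Points that land in the same bucket are offered to each other as
    neighbor candidates; each node keeps at most `degree` closest ones.
    """
    dim = len(points[0])
    # Assumption: the random hash is a set of random hyperplanes.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    buckets = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(sum(a * b for a, b in zip(h, p)) >= 0 for h in planes)
        buckets[key].append(i)
    for members in buckets.values():
        for i in members:
            candidates = (set(graph[i]) | set(members)) - {i}
            ranked = sorted(candidates, key=lambda j: angle(points[i], points[j]))
            graph[i] = ranked[:degree]
    # `planes` is not stored: only the graph retains the effect of this hash.


def query(points, graph, q, start=0):
    """Greedy walk (an assumption): step to the closest neighbor while it improves."""
    current, best = start, angle(points[start], q)
    while True:
        nxt = min(graph[current], key=lambda j: angle(points[j], q), default=None)
        if nxt is None or angle(points[nxt], q) >= best:
            return current, best
        current, best = nxt, angle(points[nxt], q)
```

Calling `refine` repeatedly with fresh randomness can only keep or improve each node's neighbor list, which mirrors the abstract's point that query efficiency benefits from accumulated randomness without the hash functions being stored. A greedy walk can stop at a local optimum, so this sketch makes no claim about the logarithmic query time reported for Amordad.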