Towards computational improvement of DNA database indexing and short DNA query searching

Abstract

To speed up searches of massive DNA databases, the database is first indexed using a mapping function. Exact query hits can then be identified by searching the indexed data structure. If the database is searched with an annotated DNA query, such as a known promoter consensus sequence, the starting locations and the number of potential genes can be determined. This is particularly relevant when unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries are time-consuming. In this paper, we propose a fast DNA database indexing and searching approach that identifies all query hits in the database without examining every entry in the indexed data structure, while limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, assuming there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at certain starting positions are not reported if the database is searched with a query shorter than the length of the DNA database words being mapped. A solution to this drawback is also presented.
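The general idea of indexing fixed-length words with a mapping function and then looking up short queries can be illustrated with a minimal Python sketch. It is not the paper's indexing equation or Reneker's method: the base-4 encoding, the function names (`encode`, `build_index`, `search`) and the word length `n` are all assumptions chosen for illustration. The sketch also scans the last few database positions directly, because words shorter than `n` at the end of the database never enter the index; whether this matches the drawback the paper addresses is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical mapping: each nucleotide becomes one base-4 digit.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(word):
    """Map a DNA word to an integer by reading it as a base-4 number."""
    value = 0
    for base in word:
        value = value * 4 + CODE[base]
    return value


def build_index(db, n):
    """Index every length-n word of db: encoded word -> 0-based start positions."""
    index = defaultdict(list)
    for i in range(len(db) - n + 1):
        index[encode(db[i:i + n])].append(i)
    return index


def search(db, index, n, query):
    """Return all 0-based start positions of exact query hits (len(query) <= n)."""
    k = len(query)
    if k == 0 or k > n:
        raise ValueError("query length must be between 1 and n")
    # Every length-n word that starts with the query has a code in [lo, hi),
    # so only that slice of the code space is inspected, not the whole index.
    lo = encode(query) * 4 ** (n - k)
    hi = lo + 4 ** (n - k)
    hits = [pos for code in range(lo, hi) for pos in index.get(code, ())]
    # Positions after the last indexed word cannot be found through the index
    # when k < n, so they are compared against the query directly.
    for i in range(max(len(db) - n + 1, 0), len(db) - k + 1):
        if db[i:i + k] == query:
            hits.append(i)
    return sorted(hits)


db = "ACGTACGTAC"
n = 4
index = build_index(db, n)
print(search(db, index, n, "AC"))    # full-length words plus the tail scan
print(search(db, index, n, "ACGT"))  # query as long as the indexed words
```

Enumerating the code range is only practical when `n - k` is small; a sorted array of codes with binary search would be one alternative for larger gaps.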

Bibliographic record

Journal: Taylor Francis Open Select
Authors: Done Stojanov; Sašo Koceski; Aleksandra Mileva; Nataša Koceska; Cveta Martinovska Bande
Volume (issue): 28(5)
Pages: 958–967
Total pages: 10
Original format: PDF
Keywords: DNA database; fast indexing and search; all hits; E. coli

Similar documents

1. Towards computational improvement of DNA database indexing and short DNA query searching [J]. Stojanov Done, Koceski Saso, Mileva Aleksandra, Biotechnology & Biotechnological Equipment. 2014, issue 5
2. Towards computational improvement of DNA database indexing and short DNA query searching [J]. Done Stojanov, Nataša Koceska, Sašo Koceski, Biotechnology & Biotechnological Equipment. 2014, issue 5
3. Familial searching: A specialist forensic DNA profiling service utilising the National DNA Database® to identify unknown offenders via their relatives - The UK experience [J]. Maguire C.N., McCallum L.A., Storey C., Forensic Science International: Genetics. 2014, issue 1
4. An improvement of the overlap complexity in the spaced seed searching problem between genomic DNAs [C]. Phan-Thuan Do, Cam-Giang Tran-Thi, National Foundation for Science and Technology Development Conference on Information and Computer Science. 2015
5. Sampling the potential energy surface of a DNA duplex damaged by a food carcinogen: Force field parameterization by ab initio quantum calculations and conformational searching using molecular mechanics computations [D]. Wu, Xiangyang. 1999
6. SAM: String-based sequence search algorithm for mitochondrial DNA database queries [O]. Alexander Röck, Jodi Irwin, Arne Dür
7. Towards computational improvement of DNA database indexing and short DNA query searching [O]. Stojanov Done, Koceski Saso, Mileva Aleksandra, 2014
