GTRAC: fast retrieval from compressed collections of genomic variants

Tatwawadi Kedar; Hernaez Mikel; Ochoa Idoia; Weissman Tsachy

Abstract

Motivation: The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes on the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compressing this type of database lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether.

Bibliographic details

Source: Bioinformatics, 2016, Issue 17, 8 pages
Authors: Tatwawadi Kedar; Hernaez Mikel; Ochoa Idoia; Weissman Tsachy
Original format: PDF
Language: English
Chinese Library Classification: Bioengineering (biotechnology)
