Genome data modeling and data compression.


Abstract

Genome data modeling is an important area of research, and several data models have been proposed for representing and storing genome data. Challenges in biological data management include data storage, retrieval, redundancy, and integrity. In this thesis we propose two data models for representing and storing genome sequence data. Instead of storing the whole sequence of each gene separately, both models store each common subsequence only once, together with a sequence ID or GenBank identification number. The position number is stored as well, so that the whole sequence can be retrieved correctly. This would significantly reduce storage requirements and help maintain data integrity. The second model also includes a pre-coding routine to reduce storage requirements further. A study of randomness in genome data is also included. Both data models were tested with satisfactory results: sequences were compressed when they shared a significant amount of common content, and the retrieval algorithm reconstructed each sequence correctly.
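
The abstract does not say how common subsequences are detected or how the tables are laid out, so the following is only a minimal sketch of the general idea: each distinct fragment is stored once, and each gene keeps a list of (position, fragment reference) pairs from which the full sequence can be rebuilt. The fixed-size blocking, the class name `SharedSequenceStore`, and the sequence IDs `gene_A` and `gene_B` are all assumptions made for illustration, not the thesis's actual models.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SharedSequenceStore:
    """Toy store (hypothetical): distinct fragments are kept once and shared."""

    # fragment text -> fragment id, used to detect an already stored fragment
    fragment_ids: Dict[str, int] = field(default_factory=dict)
    # fragment id -> fragment text
    fragment_text: List[str] = field(default_factory=list)
    # sequence id -> list of (start position, fragment id)
    genes: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)

    def add(self, seq_id: str, sequence: str, block: int = 8) -> None:
        """Split a sequence into fixed-size blocks (an assumption) and store
        only the blocks that have not been seen before."""
        refs = []
        for pos in range(0, len(sequence), block):
            chunk = sequence[pos:pos + block]
            frag_id = self.fragment_ids.get(chunk)
            if frag_id is None:
                frag_id = len(self.fragment_text)
                self.fragment_ids[chunk] = frag_id
                self.fragment_text.append(chunk)
            refs.append((pos, frag_id))
        self.genes[seq_id] = refs

    def retrieve(self, seq_id: str) -> str:
        """Rebuild the full sequence by ordering fragments by position."""
        refs = sorted(self.genes[seq_id])
        return "".join(self.fragment_text[frag_id] for _, frag_id in refs)

    def stored_bases(self) -> int:
        """Number of bases physically stored across all distinct fragments."""
        return sum(len(text) for text in self.fragment_text)


store = SharedSequenceStore()
store.add("gene_A", "ATGGCGTACCTGAAGTTTGA")
store.add("gene_B", "ATGGCGTACCTGAAGTGGCA")
print(store.retrieve("gene_B"))
print(store.stored_bases())
```

In this toy case the two 20-base sequences share their first two blocks, so fewer bases are stored than the 40 in the raw input, and retrieval returns each original sequence unchanged. When sequences share little content, a scheme like this saves nothing, which matches the abstract's remark that compression depended on a significant amount of common content.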

Bibliographic record

Author: Radhakrishnan, Radhika
Author affiliation: University of Nevada, Reno
Degree-granting institution: University of Nevada, Reno
Subject: Computer science; Bioinformatics
Degree: M.S.
Year: 2007
Pages: 48 p.
Original format: PDF
Language: English
