Biological entities are strongly related and mutually dependent. There is therefore a growing need to corroborate and integrate data from different resources and from different aspects of biological systems in order to analyze them effectively. To identify entities, existing databases use explicit references by accession number or a shared ontology. Some databases cross-link elements from other databases based on these identifiers. However, this cross-reference information is incomplete, and in some databases it is not readily available. Moreover, these links are not established in coordination with the linked databases. Because the source databases change rapidly, this leads to problems of consistency and updatability. Furthermore, it is hard to query this wealth of data in ways that exploit the mutual dependency between entities. Biozon is a unified biological database that integrates heterogeneous data types, such as nucleic acid sequences, proteins, structures, protein domains and protein families, protein-protein interactions and cellular pathways, together with the relationships between them, into a single extensive schema. This schema allows one to see each data instance in its full biological context. More importantly, it allows for complex searches that span multiple data types from a heterogeneous set of sources, and for arbitrary computations on that data. Biozon can also rank results, much as Google ranks web documents, and uses similarity relationships to extend query results to similar biological entities.