This article is a tutorial for PDBj Mine, a new database and its interface for Protein Data Bank Japan (PDBj). In PDBj Mine, data are loaded from files in the PDBMLplus format (an extension of PDBML, PDB's canonical XML format, enriched with annotations) and are then served to PDBj users via the World Wide Web (WWW). We describe the basic design of the relational database (RDB) and web interfaces of PDBj Mine. The contents of PDBMLplus files are first broken into XPath entities, and these paths and data are indexed in a way that reflects the hierarchical structure of the XML files. The data for each XPath type are saved into a corresponding relational table named after the XPath itself. The generation of table definitions from the PDBMLplus XML schema is fully automated. For efficient searching, frequently queried terms are compiled into a brief summary table. Casual users can perform a simple keyword search or use 'Advanced Search', which lets them specify various conditions on the entries. More experienced users can query the database using SQL statements, which can be constructed in a uniform manner. Thus, PDBj Mine combines the flexibility of XML documents with the robustness of the RDB.
Database URL:
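The idea of storing each XPath type in a table named after the path can be illustrated with a minimal sketch. This is not PDBj Mine's actual loader or schema: the sample document, the `iter_leaves` and `load_document` helpers, the three-column table layout, and the use of SQLite are all assumptions made for illustration, and XML namespaces and attributes are ignored.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical miniature document; real PDBMLplus files are far larger
# and use XML namespaces, which this sketch ignores.
SAMPLE_XML = """
<datablock>
  <cellCategory>
    <cell>
      <length_a>61.2</length_a>
      <length_b>40.5</length_b>
    </cell>
  </cellCategory>
  <citationCategory>
    <citation><title>First paper</title></citation>
    <citation><title>Second paper</title></citation>
  </citationCategory>
</datablock>
""".strip()


def iter_leaves(elem, path=""):
    """Yield (xpath_type, text) for every leaf element that carries text."""
    current = f"{path}/{elem.tag}"
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if text:
            yield current, text
    for child in children:
        yield from iter_leaves(child, current)


def load_document(conn, doc_id, xml_text):
    """Store each leaf value in a table named after its XPath type."""
    root = ET.fromstring(xml_text)
    for ordinal, (xpath, value) in enumerate(iter_leaves(root)):
        # Quote the path so it is a valid SQL identifier despite the slashes.
        table = '"' + xpath.replace('"', '""') + '"'
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(doc_id TEXT, ordinal INTEGER, value TEXT)"
        )
        # ordinal records document order so sibling values stay sorted.
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)",
                     (doc_id, ordinal, value))
    conn.commit()


conn = sqlite3.connect(":memory:")
load_document(conn, "1ABC", SAMPLE_XML)  # "1ABC" is a made-up entry ID

# Every query has the same shape: SELECT ... FROM "<xpath>".
titles = [row[0] for row in conn.execute(
    'SELECT value FROM "/datablock/citationCategory/citation/title" '
    "ORDER BY ordinal"
)]
print(titles)
```

Because the table name is the path itself, a user who knows where a value sits in the XML file already knows which table to query, which is one plausible reading of SQL statements being constructed "in a uniform manner".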