Enhanced bioinformatics data modeling concepts and their use in querying and integration.

Abstract

In bioinformatics research, scientists routinely face two problems: modeling complex data types and integrating diverse resources. Traditional data models such as EER lack the expressive power to capture many characteristics that are common in bioinformatics data. We first propose extensions to the ER model that allow accurate representation of many of these characteristics. We then use these concepts in an integrative system that gives biologists an easy-to-use interface for constructing queries. Our research uses the enhanced conceptual modeling concepts to create a prototype mediator for querying multiple data sources. The relationships between biological entities are represented semantically as domain ontologies stored in the mediator, so that experts can analyze and correlate the integrated query results. The following research has been conducted: (1) We propose new EER schema notation to represent commonly occurring biological concepts: the ordering properties of DNA sequences, the 3D structure of proteins, and the functional processes of metabolic pathways. (2) We then use these new relationships to develop the mediated domain ontology, which informs the interface design and query processor implementation of our mediator system.

Our mediated schema is based on a hybrid of taxonomy ontologies (core concepts and external classification/annotation concepts) for interpreting raw data sets (protein and gene sequences) in the context of molecular interactions, biochemical pathways, and biological processes. We adopt the RDF data model to implement the mediation data. Our mediator mainly takes a browsing-based approach to integrating different data sources. Extra data can be retrieved dynamically through the web service. By browsing the ontology tree in the query interface, users select concepts of interest and associated attributes to formulate queries based on their domain knowledge. The query result is a set of database entry accessions with associated attribute values. Users can click each accession link to see a detailed report, or cross-compare attributes of these data instances. Query usability and performance were tested on real data sets from UniProt [30], ENZYME [8], CATH [23], and GO [29].
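The browsing-based query flow described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the triples, concept names, accessions, and the helper functions `descendants` and `query` are all hypothetical, and plain Python tuples stand in for an RDF store. The sketch assumes that selecting a concept in the ontology tree also selects its sub-concepts, and that the result maps each matching accession to the requested attribute values.

```python
from collections import defaultdict

# Hypothetical mediation data as (subject, predicate, object) triples.
# Concept names, accessions, and values are placeholders, not real records.
TRIPLES = [
    ("Enzyme", "subClassOf", "Protein"),
    ("Kinase", "subClassOf", "Enzyme"),
    ("entry:A1", "type", "Kinase"),
    ("entry:A1", "name", "example kinase"),
    ("entry:A1", "organism", "example organism"),
    ("entry:B2", "type", "Protein"),
    ("entry:B2", "name", "example structural protein"),
]


def descendants(concept, triples):
    """Return the concept plus every concept below it in the taxonomy."""
    children = defaultdict(set)
    for s, p, o in triples:
        if p == "subClassOf":
            children[o].add(s)
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children[current])
    return seen


def query(concept, attributes, triples):
    """Return {accession: {attribute: value}} for instances of the concept."""
    concepts = descendants(concept, triples)
    instances = {s for s, p, o in triples if p == "type" and o in concepts}
    result = {accession: {} for accession in instances}
    for s, p, o in triples:
        if s in instances and p in attributes:
            result[s][p] = o
    return result


# A user picks the concept "Enzyme" and the attribute "name" in the tree.
enzyme_names = query("Enzyme", ["name"], TRIPLES)
```

Here `enzyme_names` holds one accession, `entry:A1`, because `Kinase` sits below `Enzyme` in the toy taxonomy, while `entry:B2` is typed only as `Protein`.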

Bibliographic record

  • Author: Ji, Feng
  • Author affiliation: The University of Texas at Arlington
  • Degree-granting institution: The University of Texas at Arlington
  • Subject: Biology, Bioinformatics; Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 114 p.
  • Original format: PDF
  • Language: English
