A Collection of Benchmark Data Sets for Knowledge Graph-Based Similarity in the Biomedical Domain

Abstract

The ability to compare entities within a knowledge graph is a cornerstone technique for several applications, ranging from the integration of heterogeneous data to machine learning. It is of particular importance in biomedical applications such as the prediction of protein-protein interactions, associations between diseases and genes, and the cellular localization of proteins. However, building a gold standard data set to support the evaluation of similarity measures is non-trivial, due to the size, diversity and complexity of biomedical knowledge graphs. We present a collection of 21 benchmark data sets that aim to circumvent the difficulties in building benchmarks for large biomedical knowledge graphs by exploiting proxies for biomedical entity similarity. These data sets include data from two successful biomedical ontologies, the Gene Ontology and the Human Phenotype Ontology, and explore proxy similarities based on protein and gene properties. The data sets vary in size and cover four different species at different levels of annotation completion. For each data set we also provide semantic similarity computations with representative state-of-the-art measures.
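As a rough illustration of what a knowledge graph-based semantic similarity computation can look like, the sketch below scores two annotated entities by the Jaccard overlap of their ontology terms after adding all ancestor terms. It is not one of the measures provided with the benchmark. The toy ontology, the term identifiers, the protein names and the function names are all hypothetical and chosen only to keep the example self-contained.

```python
# Toy ontology (hypothetical identifiers): each term maps to its direct parents.
PARENTS = {
    "T:root": set(),
    "T:a": {"T:root"},
    "T:b": {"T:root"},
    "T:a1": {"T:a"},
    "T:a2": {"T:a"},
}

# Hypothetical entity annotations: each protein is annotated with ontology terms.
ANNOTATIONS = {
    "P1": {"T:a1"},
    "P2": {"T:a2"},
    "P3": {"T:b"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy ontology."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def extended(annotations):
    """Extend a set of annotation terms with every ancestor term."""
    result = set()
    for term in annotations:
        result |= ancestors(term)
    return result


def jaccard_similarity(annotations_a, annotations_b):
    """Jaccard overlap of two ancestor-extended annotation sets."""
    a = extended(annotations_a)
    b = extended(annotations_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    for left, right in [("P1", "P2"), ("P1", "P3")]:
        score = jaccard_similarity(ANNOTATIONS[left], ANNOTATIONS[right])
        print(left, right, score)
```

In this toy setting, P1 and P2 share a closer common ancestor than P1 and P3, so they receive the higher score. A benchmark of the kind described above would then compare such scores against a proxy similarity derived from protein or gene properties.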

Bibliographic details

Source: Extended Semantic Web Conference, 2020, pp. 50-55 (6 pages)
Authors: Carlota Cardoso; Rita T. Sousa; Sebastian Koehler; Catia Pesquita
Original format: PDF

