SemNet: Using Local Features to Navigate the Biomedical Concept Graph

Andrew R. Sedler; Cassie S. Mitchell


Abstract

Literature-Based Discovery (LBD) aims to connect scientists across silos by assembling models of the literature to reveal previously hidden connections. Unfortunately, LBD systems have been unable to achieve user adoption on a large scale. This work develops open-source software in Python to convert a database of semantic predications from all of PubMed’s 27.9 million indexed abstracts into a semantic inference network and biomedical concept graph in Neo4j. The developed software, called SemNet, queries a modified version of the publicly available SemMedDB and computes feature vectors on source-target pairs. Each unique Unified Medical Language System (UMLS) concept is represented as a node and each predication as an edge. Each node is assigned one of 132 node labels (e.g. Amino Acid, Peptide, or Protein (AAPP); Gene or Genome (GG); etc.) and each edge is labeled with one of 58 predications (e.g. treats, causes, inhibits, etc.). SemNet computes a single feature value for each metapath, or sequence of node types, between a source node and user-specified target node(s). Several different types of metapath-based features (count, degree-weighted path count, and HeteSim metric) are computed and vectorized. SemNet employs an unsupervised learning algorithm for rank aggregation (ULARA) to rank the identified source nodes that are most relevant to the user-specified target node(s). Statistical analysis of correlation among identified source nodes or resultant literature network features is used to identify patterns that can guide future research. Analysis of high-residual nodes is used to compare and contrast SemNet rankings between different targets of interest. An example SemNet use case is presented to assess “the differential impact of smoking on cognition in males and females” using the following target nodes: nicotine, learning, memory, tetrahydrocannabinol (THC), cigarette smoke, X chromosome, and Y chromosome. Detailed rankings are discussed.
Overall results suggest a hypothesis in which smoking negatively impacts cognition to a greater extent in females, but has stronger cardiovascular impacts in males. In summary, SemNet provides an adoptable method for efficient LBD of PubMed that extends beyond omics-only relationships to true multi-scalar connections that can provide actionable insight for predictive medicine, research prioritization, and clinical care.
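
The metapath features named in the abstract can be illustrated on a toy graph. The following is a minimal sketch, not SemNet's implementation: it works on an in-memory graph rather than Neo4j, and the node names, type labels, edges, function names, and the damping value of 0.5 are all assumptions chosen for illustration. The degree-weighted path count here follows one common formulation (each path contributes the product of its node degrees raised to a negative damping exponent) and, as a simplification, uses each node's total degree instead of a per-edge-type degree. The HeteSim metric is omitted.

```python
from collections import defaultdict

# Hypothetical toy data: node -> type label, plus undirected edges.
NODE_TYPE = {
    "nicotine": "PHSU",
    "CHRNA4": "GG",
    "CHRNB2": "GG",
    "dopamine": "NSBA",
    "memory": "MENP",
}
EDGES = [
    ("nicotine", "CHRNA4"),
    ("nicotine", "CHRNB2"),
    ("nicotine", "dopamine"),
    ("CHRNA4", "memory"),
    ("CHRNB2", "memory"),
    ("dopamine", "memory"),
    ("CHRNA4", "dopamine"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def simple_paths(adj, source, target, max_len):
    """Yield paths without repeated nodes that use at most max_len edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 >= max_len:
            continue  # path already uses max_len edges; do not extend it
        for nxt in adj[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def metapath_features(adj, node_type, source, target, max_len=2, damping=0.5):
    """Return (path count, degree-weighted path count) keyed by metapath.

    A metapath is the tuple of node type labels along a path.
    """
    counts = defaultdict(int)
    dwpc = defaultdict(float)
    for path in simple_paths(adj, source, target, max_len):
        metapath = tuple(node_type[n] for n in path)
        counts[metapath] += 1
        # Down-weight paths through highly connected nodes (simplified:
        # total degree of every node on the path, raised to -damping).
        weight = 1.0
        for n in path:
            weight *= len(adj[n]) ** (-damping)
        dwpc[metapath] += weight
    return dict(counts), dict(dwpc)


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    path_counts, weighted = metapath_features(
        adjacency, NODE_TYPE, "nicotine", "memory"
    )
    for mp in sorted(path_counts):
        print(" > ".join(mp), path_counts[mp], round(weighted[mp], 4))
```

Each source-target pair thus yields one value per metapath and per feature type. Collecting those values into a fixed-order vector gives the kind of feature vector that a rank-aggregation step could then consume.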


Bibliographic details

Source: Frontiers in Bioengineering and Biotechnology, 2019, Issue 2, 19 pages
Authors: Andrew R. Sedler; Cassie S. Mitchell
Original format: PDF
Chinese Library Classification: Bioengineering (biotechnology)
Keywords: literature-based discovery; knowledge graph; text mining; unsupervised learning; Python (programming language); literature map; literature mining

