Venue: International Joint Conference on Biomedical Engineering Systems and Technologies
Construct Semantic Type of 'Gene-mutation-disease' Relation by Computer-aided Curation from Biomedical Literature

Abstract

Background: The current semantic type of the "gene-mutation-disease" relation lacks a fine-grained classification and corresponding relation signal words, which limits its use in text-mining-based relation extraction from biomedical literature. Methods: We propose a computer-aided curation pipeline in which open relation extraction, signal word clustering, and relation type mapping are used to analyze biomedical abstracts and construct a semantic type for the "gene-mutation-disease" relation. Coverage metrics are used to evaluate the defined relation types, and ClinVar is chosen as a target to test the usability and performance of our semantic type in guiding relation extraction from biomedical literature. Results: We constructed a 5-layer, 16-category semantic type of the "gene-mutation-disease" relation, with a vocabulary list of 58 commonly used relation signal words. The vocabulary list has a coverage of 95.08% and the semantic type has a coverage of 94.12%. From 25 abstracts linked to 30 ClinVar records, 15 relations are correctly mapped and 8 additional novel relations are discovered. Conclusion: The results show that our semantic type covers the main relations among "gene", "mutation", and "disease" and performs well in guiding relation extraction from biomedical text, even with relatively out-of-date dictionary-based text mining methods.
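
To make the dictionary-based mapping and the coverage idea more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract lists neither the 58 signal words nor the 16 categories, so the vocabulary entries, the category names, and the function names `map_relation` and `coverage` are all hypothetical. The coverage definition used here (the share of sentences matched by at least one signal word) is also an assumption, since the abstract does not define the metric.

```python
import re

# Hypothetical signal-word vocabulary mapped to hypothetical relation types.
# The paper's real 58 signal words and 16 categories are not given in the abstract.
SIGNAL_WORDS = {
    "cause": "causal",
    "causes": "causal",
    "associated with": "association",
    "increases the risk of": "risk",
}


def map_relation(sentence):
    """Return (signal_word, relation_type) for the longest matching signal word, or None."""
    text = sentence.lower()
    # Try longer phrases first so a multi-word signal wins over a shorter one.
    for phrase in sorted(SIGNAL_WORDS, key=len, reverse=True):
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return phrase, SIGNAL_WORDS[phrase]
    return None


def coverage(sentences):
    """Assumed metric: fraction of sentences matched by at least one signal word."""
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if map_relation(s) is not None)
    return hits / len(sentences)


if __name__ == "__main__":
    examples = [
        "The BRCA1 mutation is associated with breast cancer.",
        "Mutations in CFTR cause cystic fibrosis.",
        "The patient was enrolled in 2010.",
    ]
    for s in examples:
        print(s, "->", map_relation(s))
    print("coverage:", coverage(examples))
```

Matching longer phrases first is a design choice made for this sketch; a real pipeline would also need gene, mutation, and disease entity recognition before any relation can be assigned.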