Source: IEEE International Conference on e-Science

Generating knowledge networks from phenotypic descriptions


Abstract

Several computing systems rely on information about living beings, such as Identification Keys: artifacts created by biologists to identify specimens by following a flow of questions about their observable characters (phenotype). These questions are written as free text, e.g., “big and black eye”. Free text hampers automatic interpretation by machines, limiting their ability to search and compare terms and to perform integration tasks. This paper proposes a method to extract phenotypic information from natural language texts in legacy biology information systems, transforming them into an Entity-Quality formalism, a format that represents each phenotype character (Entity) and its state (Quality). Our approach aligns automatically recognized Entities and Qualities with domain concepts described in ontologies. It adopts existing Natural Language Processing techniques and adds an original extra step that exploits intrinsic characteristics of phenotypic descriptions and of the organizational structure of Identification Keys. The approach was validated on FishBase data. We conducted extensive experiments based on a manually annotated Gold Standard set to assess the precision and applicability of the proposed extraction method. The results indicate the feasibility of our technique, its benefits, and the possibilities for scientific studies using the extracted knowledge network.
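To make the Entity-Quality idea concrete, the following is a minimal illustrative sketch in Python of how a free-text description such as “big and black eye” might be split into Entity-Quality pairs. It is not the method proposed in the paper: the vocabularies `ENTITY_TERMS` and `QUALITY_TERMS`, the `EQStatement` type, and the `extract_eq` function are all hypothetical, and a simple word lookup stands in for the NLP pipeline and ontology alignment that the abstract describes.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical toy vocabularies (assumption). A real system would align
# recognized terms with concepts from domain ontologies instead.
ENTITY_TERMS = {"eye", "fin", "mouth"}
QUALITY_TERMS = {"big", "small", "black", "red"}


@dataclass(frozen=True)
class EQStatement:
    """One phenotype statement: a character (entity) and its state (quality)."""
    entity: str
    quality: str


def extract_eq(description: str) -> List[EQStatement]:
    """Pair every known entity word with every known quality word.

    This naive lookup only illustrates the target representation; it does
    not handle multi-word terms, negation, or several entities whose
    qualities differ.
    """
    tokens = description.lower().replace(",", " ").split()
    entities = [t for t in tokens if t in ENTITY_TERMS]
    qualities = [t for t in tokens if t in QUALITY_TERMS]
    return [EQStatement(e, q) for e in entities for q in qualities]


if __name__ == "__main__":
    for statement in extract_eq("big and black eye"):
        print(statement)
```

Under these assumptions, “big and black eye” yields two statements that share the entity `eye`, one with the quality `big` and one with `black`. Both can be searched and compared as structured terms, which the original free text does not allow.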
