Conference: International Conference on Data Engineering Workshops
A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature

Abstract

We propose a framework for identifying, disambiguating and storing protein-related abbreviations as found in the full texts of scientific papers, in order to build and maintain a publicly available abbreviation repository via a semi-automatic process. This process involves information extraction methods and techniques for acronym identification and resolution, based on lexical clues and syntactical, largely domain-independent criteria. A dictionary and an ontology for proteins provide the means for matching and disambiguating the biological entities. User feedback is gathered at the end of the process and the confirmed entries are then stored and made available to the scientific community for further reviewing.
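
To make the identification step concrete, here is a minimal sketch of one common lexical clue: a short form in parentheses directly after its long form, matched by word initials and then checked against a dictionary. This is an illustration under stated assumptions, not the authors' method. The names `find_abbreviations`, `is_known_protein` and `PROTEIN_DICTIONARY` are hypothetical, and the two dictionary entries are toy examples standing in for a real protein dictionary and ontology.

```python
import re

# Hypothetical toy dictionary (assumption); a real system would query a
# curated protein dictionary and ontology instead.
PROTEIN_DICTIONARY = {
    "epidermal growth factor receptor",
    "tumor necrosis factor",
}

# Candidate short form: 2 to 10 letters, digits or hyphens inside parentheses.
SHORT_FORM = re.compile(r"\(([A-Za-z0-9][A-Za-z0-9\-]{1,9})\)")


def find_abbreviations(text):
    """Return (short_form, long_form) pairs that follow the 'long form (SF)' pattern."""
    pairs = []
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        # Lexical clue: require at least one uppercase letter, which skips
        # ordinary parenthetical words such as "(left)".
        if not any(c.isupper() for c in short):
            continue
        letters = [c.lower() for c in short if c.isalnum()]
        # Take as many preceding words as the short form has characters.
        words = re.findall(r"[A-Za-z0-9\-]+", text[:match.start()])
        candidate = words[-len(letters):]
        # Accept only if every word initial matches the corresponding character.
        if len(candidate) == len(letters) and all(
            word[0].lower() == letter for word, letter in zip(candidate, letters)
        ):
            pairs.append((short, " ".join(candidate)))
    return pairs


def is_known_protein(long_form):
    """Dictionary lookup used here as a stand-in for the disambiguation step."""
    return long_form.lower() in PROTEIN_DICTIONARY
```

In a semi-automatic pipeline of the kind the abstract describes, pairs that pass the dictionary check would be shown to a user for confirmation before being stored. Strict initial-letter matching misses many real abbreviations (for example those that skip function words or take letters from inside a word), so it serves only as a starting point.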
