2011 IEEE 27th International Conference on Data Engineering Workshops
A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature

Abstract

We propose a framework for identifying, disambiguating and storing protein-related abbreviations as found in the full texts of scientific papers, in order to build and maintain a publicly available abbreviation repository via a semi-automatic process. This process involves information extraction methods and techniques for acronym identification and resolution, based on lexical clues and syntactical, largely domain-independent criteria. A dictionary and an ontology for proteins provide the means for matching and disambiguating the biological entities. User feedback is gathered at the end of the process and the confirmed entries are then stored and made available to the scientific community for further reviewing.
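
The abstract does not spell out which lexical clues the framework uses, so the following is only an illustrative sketch of one common heuristic, not the authors' method: a parenthesized short form is paired with the words immediately before it when their initial letters spell the short form. The names `SHORT_FORM` and `find_abbreviations` are hypothetical, and the dictionary and ontology matching described above is not modeled here.

```python
import re

# Assumption: a short form is a parenthesized token of 2-10 characters
# made of letters, digits, or hyphens, e.g. "(MAPK)".
SHORT_FORM = re.compile(r"\(([A-Za-z0-9][A-Za-z0-9-]{1,9})\)")
WORD = re.compile(r"[A-Za-z0-9]+")


def find_abbreviations(text):
    """Return (short_form, long_form) pairs from 'long form (SF)' patterns."""
    pairs = []
    for match in SHORT_FORM.finditer(text):
        short = match.group(1)
        # Lexical clue: require at least one uppercase letter.
        if not any(c.isupper() for c in short):
            continue
        letters = [c.lower() for c in short if c.isalnum()]
        # Words that precede the opening parenthesis.
        words = list(WORD.finditer(text[:match.start()]))
        candidate = words[-len(letters):]
        if len(candidate) != len(letters):
            continue
        # Each preceding word must start with the matching letter.
        if all(w.group()[0].lower() == l for w, l in zip(candidate, letters)):
            long_form = text[candidate[0].start():candidate[-1].end()]
            pairs.append((short, long_form))
    return pairs


sample = (
    "The mitogen-activated protein kinase (MAPK) pathway interacts with "
    "green fluorescent protein (GFP), as shown (see Fig. 2)."
)
print(find_abbreviations(sample))
```

A first-letter match like this misses many real abbreviations (for example, short forms that take several letters from one word), which is one reason a confirmation step such as the user feedback mentioned in the abstract is useful.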