METHOD AND APPARATUS OF PROTEIN NAME NORMALIZATION USING ONTOLOGY MAPPING

Abstract

A method and apparatus for protein name normalization using ontology mapping are provided to accurately identify the proteins mentioned in the literature by mapping protein names recognized in biological literature to a standardized protein ontology. A literature recognition unit (110) receives biological literature as input and extracts protein names and species data. An abbreviation dictionary DB (130) stores abbreviated protein names together with their original (full) protein names. An abbreviated protein name restoration unit (120) restores an abbreviated protein name to its original protein name. A synonym dictionary DB (150) is constructed from the ontology. An inverted index DB (160) holds an inverted index of the synonym dictionary. A protein code analysis unit (140) determines the protein code by comparing the extracted protein name against the inverted index DB of the synonym dictionary and calculating their similarity. An ontology ID assignment unit (190) assigns the final ontology ID to the protein name based on the protein code and the species.
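The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the dictionary entries, the protein codes, the `ONT:` identifiers, and all function names are hypothetical, and Jaccard token overlap with a 0.5 threshold is a stand-in, since the abstract does not say which similarity measure is used.

```python
from collections import defaultdict

# Hypothetical toy data; none of these entries come from the patent.
ABBREVIATIONS = {"TNF-a": "tumor necrosis factor alpha"}  # abbreviation dictionary DB (130)

SYNONYMS = {  # synonym dictionary DB (150): protein code -> known names
    "P_TNF": ["tumor necrosis factor alpha", "cachectin"],
    "P_IL6": ["interleukin 6", "interferon beta 2"],
}

ONTOLOGY_IDS = {  # (protein code, species) -> ontology ID
    ("P_TNF", "human"): "ONT:0001",
    ("P_TNF", "mouse"): "ONT:0002",
    ("P_IL6", "human"): "ONT:0003",
}


def tokenize(name):
    """Lowercase a name and split it into word tokens."""
    return name.lower().replace("-", " ").split()


def build_inverted_index(synonyms):
    """Map each token to the (code, synonym) pairs containing it (DB 160)."""
    index = defaultdict(set)
    for code, names in synonyms.items():
        for name in names:
            for token in tokenize(name):
                index[token].add((code, name))
    return index


def restore_abbreviation(name, abbreviations):
    """Return the full name for a known abbreviation, else the name unchanged (unit 120)."""
    return abbreviations.get(name, name)


def jaccard(a, b):
    """Token-set overlap; an assumed similarity measure, not taken from the source."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def find_protein_code(name, index, threshold=0.5):
    """Pick the best-matching protein code via the inverted index (unit 140)."""
    query = tokenize(name)
    candidates = set()
    for token in query:
        candidates |= index.get(token, set())
    best_code, best_score = None, 0.0
    for code, synonym in candidates:
        score = jaccard(query, tokenize(synonym))
        if score > best_score:
            best_code, best_score = code, score
    return best_code if best_score >= threshold else None


def normalize(name, species, index):
    """Assign an ontology ID from the protein code and species (unit 190)."""
    full_name = restore_abbreviation(name, ABBREVIATIONS)
    code = find_protein_code(full_name, index)
    if code is None:
        return None
    return ONTOLOGY_IDS.get((code, species))
```

With this toy data, building the index once via `build_inverted_index(SYNONYMS)` and calling `normalize("TNF-a", "human", index)` first expands the abbreviation, then resolves the protein code through the inverted index, and finally selects the species-specific ID. The inverted index limits similarity scoring to synonyms that share at least one token with the query, rather than scanning the whole dictionary.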
