System and method for the indexing of organic chemical structures mined from text documents

Abstract

Disclosed is a method, a computer program product and a system for processing documents that contain chemical names. The system has a unit to partition document text and to assign semantic meaning to words; a unit to recognize any substructures present in the chemical name fragments; and a unit to determine structural connectivity information of the chemical name fragments and recognized substructures and to store the determined structural connectivity information in a searchable index. The system further includes a unit to search a text index using at least one of a fragment name and a substructure name and to search the structure index by at least one of fragment connectivity and substructure connectivity. At an intersection of the search results from the structure index and the text index, the system operates to identify at least one document that contains a reference to a corresponding chemical compound.
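
The two-index lookup described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fragment list `FRAGMENTS`, the helper names, and the connectivity model (two fragments count as connected when they are adjacent inside one word) are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical fragment vocabulary; a real system would use nomenclature rules.
FRAGMENTS = ("methyl", "ethyl", "chloro", "benzene", "phenol")

# Longest alternatives first, so "methyl" is not misread as "ethyl".
PATTERN = re.compile("|".join(sorted(FRAGMENTS, key=len, reverse=True)))


def build_indexes(docs):
    """Build a text index (fragment -> doc ids) and a structure index
    ((fragment, next fragment) -> doc ids) from a dict of id -> text."""
    text_index = defaultdict(set)
    structure_index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            frags = PATTERN.findall(word)
            for frag in frags:
                text_index[frag].add(doc_id)
            # Assumed connectivity: adjacent fragments within the same word.
            for left, right in zip(frags, frags[1:]):
                structure_index[(left, right)].add(doc_id)
    return text_index, structure_index


def search(text_index, structure_index, fragment, connection):
    """Return documents found by BOTH the text and the structure lookup."""
    by_name = text_index.get(fragment, set())
    by_structure = structure_index.get(connection, set())
    return by_name & by_structure


docs = {
    "d1": "Toluene, also called methylbenzene, is a solvent.",
    "d2": "Ethylbenzene is a precursor to styrene.",
    "d3": "Methyl chloride and benzene were mixed.",
}
text_index, structure_index = build_indexes(docs)
result = search(text_index, structure_index, "methyl", ("methyl", "benzene"))
```

Here the text index alone matches both `d1` and `d3` for "methyl", while the structure index narrows the result to the document in which methyl is actually attached to benzene. That narrowing is the role the abstract assigns to the intersection of the two result sets.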
