
Using SMILES strings for the description of chemical connectivity in the Crystallography Open Database



Abstract

Computer descriptions of chemical molecular connectivity are necessary for searching chemical databases and for predicting chemical properties from molecular structure. This article reports the ongoing work to describe, in SMILES format, the chemical connectivity of the entries in the Crystallography Open Database (COD). Like the COD itself, this collection of SMILES is publicly available on an open-access basis for chemical (substructure) search or for any other purpose. The conventions followed for representing compounds that do not fit valence bond theory are outlined for the most frequently found cases. The procedure for obtaining SMILES from the CIF files starts by checking whether the atoms in the asymmetric unit are a chemically acceptable image of the compound. When they are not (a molecule lying on a symmetry element, disorder, polymeric species, etc.), the previously published cif_molecule program can produce such an image in many cases. The Open Babel program package is then applied to obtain SMILES strings from the CIF files (either those taken directly from the COD or those produced by cif_molecule, where applicable). The results are then checked and/or fixed by a human editor in a computer-aided task that at present still consumes a great deal of human time. Although the procedure still needs to be improved to make it more automatic (and hence faster), it has already yielded more than 160,000 curated chemical structures. The purpose of this article is to announce the existence of this work to the chemical community and to spread the use of its results.

Electronic supplementary material: The online version of this article (10.1186/s13321-018-0279-6) contains supplementary material, which is available to authorized users.
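The order of steps described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the authors' actual pipeline: `plan_steps` and its `needs_cif_molecule` flag are hypothetical names, the way `cif_molecule` is invoked (one CIF file as its argument) is an assumption, and the sketch only builds the commands without running them. The `obabel` options `-icif` and `-osmi` select CIF input and SMILES output.

```python
from pathlib import Path


def obabel_command(cif_path):
    """Build an Open Babel command that converts a CIF file to SMILES."""
    # -icif: read the input as CIF; -osmi: write the output as SMILES.
    return ["obabel", "-icif", str(cif_path), "-osmi"]


def plan_steps(cif_path, needs_cif_molecule):
    """Return the ordered (name, command) steps for one COD entry.

    needs_cif_molecule is a hypothetical flag standing in for the check of
    whether the asymmetric unit is a chemically acceptable image of the
    compound (False means the COD file can be used directly).
    """
    cif_path = Path(cif_path)
    steps = []
    if needs_cif_molecule:
        # Hypothetical name for the file that would hold cif_molecule's output.
        rebuilt = cif_path.with_suffix(".molecule.cif")
        # Assumed invocation: cif_molecule takes the CIF file as its argument.
        steps.append(("cif_molecule", ["cif_molecule", str(cif_path)]))
        cif_path = rebuilt
    steps.append(("obabel", obabel_command(cif_path)))
    # The final check is manual, so it has no command attached.
    steps.append(("human_review", None))
    return steps


if __name__ == "__main__":
    for name, command in plan_steps("1000000.cif", needs_cif_molecule=True):
        print(name, command)
```

Keeping the human review as an explicit final step reflects the abstract's point that the automated output is still checked and, where needed, fixed by an editor.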
