Journal: Journal of Chemical Information and Modeling

Rapid Identification of X-ray Diffraction Patterns Based on Very Limited Data by Interpretable Convolutional Neural Networks

Abstract

Large volumes of data from material characterizations call for rapid and automatic data analysis to accelerate materials discovery. Herein, we report a convolutional neural network (CNN) that was trained on theoretical data and very limited experimental data for fast identification of experimental X-ray diffraction (XRD) patterns of metal-organic frameworks (MOFs). To augment the training data, noise was extracted from experimental data and shuffled; it was then merged with the main peaks extracted from theoretical spectra to synthesize new spectra. For the first time, one-to-one material identification was achieved. Theoretical MOF patterns (1012) were augmented to a whole data set of 72 864 samples. The data set was then randomly shuffled and split into training (58 292 samples) and validation (14 572 samples) data sets at a ratio of 4:1. In the discrimination task, the optimized model reached its highest identification accuracy of 96.7% for the top 5 ranking on a test data set of 30 hold-out samples. Neighborhood component analysis (NCA) of the experimental XRD samples shows that samples from the same material cluster together in the NCA map. Analysis of the class activation maps of the last CNN layer further discloses the mechanism by which the CNN model successfully identifies individual MOFs from the XRD patterns. This CNN model trained with the data augmentation technique would not only open numerous potential applications for identifying XRD patterns of different materials, but also pave the way to autonomous analysis of data from other characterization tools such as FTIR, Raman, and NMR spectroscopies.
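The augmentation, 4:1 split, and top-5 scoring described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 5% peak threshold, the min-max normalization, and the random toy data are all hypothetical, and the abstract does not say how the split was rounded. The only figures taken from the abstract are 1012 patterns, 72 864 samples (which works out to 72 synthetic samples per pattern), and the 58 292 / 14 572 split.

```python
import numpy as np

def synthesize_pattern(theoretical, noise, rng, peak_threshold=0.05):
    """Merge the main peaks of a theoretical pattern with shuffled noise."""
    # Assumed rule: a "main peak" is any point above 5% of the maximum intensity.
    cutoff = peak_threshold * theoretical.max()
    peaks = np.where(theoretical >= cutoff, theoretical, 0.0)
    # Shuffle the noise so every synthetic pattern gets a different background.
    merged = peaks + rng.permutation(noise)
    # Min-max normalize to [0, 1] (an assumption, not stated in the abstract).
    return (merged - merged.min()) / (merged.max() - merged.min())

def augment(patterns, noise, copies, rng):
    """Create `copies` synthetic samples per theoretical pattern, with labels."""
    X = np.stack([synthesize_pattern(p, noise, rng)
                  for p in patterns for _ in range(copies)])
    y = np.repeat(np.arange(len(patterns)), copies)
    return X, y

def split_sizes(n, train_parts=4, val_parts=1):
    """Sizes for a train:validation split, rounding the training size up."""
    total = train_parts + val_parts
    n_train = -(-n * train_parts // total)  # integer ceiling division
    return n_train, n - n_train

def shuffle_split(X, y, rng):
    """Randomly shuffle the samples, then split them 4:1."""
    order = rng.permutation(len(X))
    n_train, _ = split_sizes(len(X))
    train, val = order[:n_train], order[n_train:]
    return (X[train], y[train]), (X[val], y[val])

def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]
    return float(np.mean([label in row for label, row in zip(labels, top_k)]))

# Toy stand-ins: 10 random "theoretical" patterns and simulated noise.
# Real inputs would be simulated MOF patterns and noise taken from measurements.
rng = np.random.default_rng(0)
toy_patterns = rng.random((10, 200)) ** 8
toy_noise = 0.02 * rng.standard_normal(200)

X, y = augment(toy_patterns, toy_noise, copies=72, rng=rng)
(train_X, train_y), (val_X, val_y) = shuffle_split(X, y, rng)
print(X.shape, train_X.shape, val_X.shape)  # (720, 200) (576, 200) (144, 200)
print(split_sizes(1012 * 72))               # (58292, 14572)
```

Rounding the training size up is chosen here only because it reproduces the reported counts. The reported 96.7% top-5 accuracy on 30 hold-out samples is consistent with 29 of the 30 samples being ranked correctly (29/30 ≈ 0.967).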

Bibliographic details

  • Source
  • Author affiliations

    Univ Missouri, Dept Mech & Aerosp Engn, Columbia, MO 65211 USA;
    Univ Missouri, Dept Mech & Aerosp Engn, Columbia, MO 65211 USA;
    Univ Missouri, Dept Mech & Aerosp Engn, Columbia, MO 65211 USA;
    Univ Missouri, Dept Mech & Aerosp Engn, Columbia, MO 65211 USA;
    Univ Missouri, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA;
    Univ Missouri, Dept Mech & Aerosp Engn, Columbia, MO 65211 USA;
    Univ Missouri, Dept Mech & Aerosp Engn, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification: Chemistry; Chemical industry
  • Keywords
