After NMR (nuclear magnetic resonance) metabolomics detection, metabolites are usually matched manually against the Human Metabolome Database (HMDB) according to their chemical shifts, but this approach is coarse and gives low identification accuracy. This study explores a more reasonable method that finds peaks and matches metabolites against a database automatically. Based on an analysis of the HMDB peak-matching method, a new method built on the Jaccard coefficient and the match ratio (number of matched peaks / total number of peaks) was proposed and implemented in MATLAB. It was then compared with the HMDB 1D NMR search using the same list of random chemical shift values. For this list, 60% of the top 20 metabolites returned by HMDB had more than 16 peaks, indicating that the HMDB method is biased toward metabolites with many peaks. The HMDB score used to rank matches differed markedly from the match ratio, whereas the score of the proposed method stayed close to the match ratio. Moreover, the metabolite ranked tenth by HMDB had no chemical shift that matched any of the given values. This paper gives preliminary evidence of the shortcomings of the HMDB peak-matching algorithm, addresses them, and finds that a matching algorithm based on the Jaccard coefficient can improve the precision of database-driven NMR metabolite identification.
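The scoring idea described above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the paper, and it is not the authors' implementation. Several details are assumptions made for illustration: the matching tolerance of 0.02 ppm, the greedy one-to-one pairing of sorted peaks, taking "total number of peaks" in the match ratio to mean the peaks of the reference metabolite, and breaking ties in the Jaccard score by match ratio. The function names and the two library entries are hypothetical.

```python
from typing import Dict, List, Tuple


def count_matches(query: List[float], reference: List[float], tol: float = 0.02) -> int:
    """Count one-to-one peak matches within `tol` ppm (tolerance is an assumed value)."""
    q, r = sorted(query), sorted(reference)
    i = j = matched = 0
    # Walk both sorted lists; pair peaks greedily when they lie within the tolerance.
    while i < len(q) and j < len(r):
        if abs(q[i] - r[j]) <= tol:
            matched += 1
            i += 1
            j += 1
        elif q[i] < r[j]:
            i += 1
        else:
            j += 1
    return matched


def score(query: List[float], reference: List[float], tol: float = 0.02) -> Tuple[float, float]:
    """Return (Jaccard coefficient, match ratio) for one reference metabolite."""
    m = count_matches(query, reference, tol)
    union = len(query) + len(reference) - m
    jaccard = m / union if union else 0.0
    # Assumption: the ratio is taken over the reference metabolite's peaks.
    ratio = m / len(reference) if reference else 0.0
    return jaccard, ratio


def rank(query: List[float], library: Dict[str, List[float]], tol: float = 0.02):
    """Rank library entries by Jaccard coefficient, then by match ratio (assumed tie-break)."""
    scored = [(name, *score(query, peaks, tol)) for name, peaks in library.items()]
    return sorted(scored, key=lambda t: (t[1], t[2]), reverse=True)


# Hypothetical query and library, not real HMDB data.
query_shifts = [1.33, 3.05, 4.11]
library = {
    "compound_A": [1.32, 4.10],
    "compound_B": [0.9, 1.33, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5],
}

for name, jaccard, ratio in rank(query_shifts, library):
    print(f"{name}: Jaccard={jaccard:.3f}, match ratio={ratio:.3f}")
```

Because the Jaccard coefficient divides by the size of the union of both peak lists, a reference entry with many unmatched peaks is penalized, which is the property the abstract contrasts with the HMDB ranking.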