Journal: Bioinformatics

Characterization of H-1 NMR spectroscopic data and the generation of synthetic validation sets


Abstract

Motivation: Common contemporary practice within the nuclear magnetic resonance (NMR) metabolomics community is to evaluate and validate novel algorithms on empirical data or simplified simulated data. Empirical data capture the complex characteristics of experimental data, but the optimal or most correct analysis is unknown a priori; therefore, researchers are forced to rely on indirect performance metrics, which are of limited value. To achieve a fair and complete analysis of competing techniques, more exacting metrics are required. Thus, metabolomics researchers often evaluate their algorithms on simplified simulated data with a known answer. Unfortunately, conclusions obtained on simulated data are only of value if the data sets are complex enough for the results to generalize to true experimental data. Ideally, synthetic data should be indistinguishable from empirical data, yet retain a known best analysis.

Results: We have developed a technique for creating realistic synthetic metabolomics validation sets based on NMR spectroscopic data. The validation sets are developed by characterizing the salient distributions in sets of empirical spectroscopic data. Using this technique, several validation sets are constructed with a variety of characteristics present in 'real' data. A case study is then presented to compare the relative accuracy of several alignment algorithms using the increased precision afforded by these synthetic data sets.
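The idea of a synthetic validation set with a known best analysis can be illustrated with a minimal sketch. The abstract does not say which distributions the authors characterize, so everything here is an assumption made for illustration: Lorentzian peak shapes, a Gaussian per-spectrum chemical-shift offset, Gaussian baseline noise, and all parameter values and function names. The cross-correlation aligner is a toy stand-in, not one of the alignment algorithms compared in the paper.

```python
import math
import random


def lorentzian(x, center, width, height):
    """Lorentzian line shape; width is the half-width at half-maximum."""
    return height * width**2 / ((x - center) ** 2 + width**2)


def synthetic_spectrum(axis, peaks, shift_sd, noise_sd, rng):
    """Build one synthetic spectrum and return it with its known shift.

    The shift is drawn from an assumed Gaussian distribution; in practice
    the distribution would be estimated from empirical spectra.
    """
    true_shift = rng.gauss(0.0, shift_sd)
    intensities = []
    for x in axis:
        signal = sum(lorentzian(x, c + true_shift, w, h) for c, w, h in peaks)
        intensities.append(signal + rng.gauss(0.0, noise_sd))
    return intensities, true_shift


def estimate_shift(reference, spectrum, step, max_lag):
    """Toy aligner: the integer lag that maximizes cross-correlation."""
    n = len(reference)
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            reference[i] * spectrum[i + lag]
            for i in range(max(0, -lag), min(n, n - lag))
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * step


rng = random.Random(0)
step = 0.001  # hypothetical axis resolution in ppm per point
axis = [i * step for i in range(1000)]
# Hypothetical peak list: (center in ppm, half-width in ppm, height).
peaks = [(0.30, 0.004, 1.0), (0.55, 0.006, 0.6), (0.80, 0.003, 0.8)]

# Noise-free, unshifted reference spectrum.
reference, _ = synthetic_spectrum(axis, peaks, 0.0, 0.0, rng)

# Because the true shift of every synthetic spectrum is known, the
# aligner can be scored directly instead of through an indirect metric.
errors = []
for _ in range(5):
    spectrum, true_shift = synthetic_spectrum(axis, peaks, 0.01, 0.01, rng)
    estimated = estimate_shift(reference, spectrum, step, max_lag=50)
    errors.append(abs(estimated - true_shift))

mean_error = sum(errors) / len(errors)
print(f"mean absolute alignment error: {mean_error:.4f} ppm")
```

The point of the sketch is the last loop: with empirical data the true shift is unavailable, whereas a synthetic set lets each algorithm be scored against the exact answer.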
