Journal: BMC Bioinformatics

Validating module network learning algorithms using simulated data

Abstract

Background: In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance.

Results: Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators.

Conclusion: We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
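
The abstract mentions a conditional entropy measure for assigning regulators to regulation program nodes but does not define it. The Python sketch below illustrates the general idea only and is not LeMoNe's actual scoring function. It assumes that a program node splits the experimental conditions into two groups, that a candidate regulator is discretized by a single expression threshold, and that a lower conditional entropy H(split | regulator state) marks a better candidate. All function names, regulator names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(partition, regulator_state):
    """H(partition | regulator_state) for two equal-length label sequences."""
    n = len(partition)
    groups = {}
    for p, r in zip(partition, regulator_state):
        groups.setdefault(r, []).append(p)
    # Weighted average of the entropy within each regulator state.
    return sum(len(g) / n * entropy(g) for g in groups.values())


def best_threshold_score(partition, expression):
    """Lowest conditional entropy over all thresholds on one regulator."""
    best = entropy(partition)  # baseline: the regulator explains nothing
    for t in sorted(set(expression))[:-1]:
        state = [x > t for x in expression]  # assumed on/off discretization
        best = min(best, conditional_entropy(partition, state))
    return best


def rank_regulators(partition, candidates):
    """Sort candidate regulators by ascending conditional entropy."""
    scores = {
        name: best_threshold_score(partition, expr)
        for name, expr in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


# Hypothetical toy data: six conditions split by one program node.
partition = ["left", "left", "left", "right", "right", "right"]
candidates = {
    "regA": [0.1, 0.2, 0.3, 1.5, 1.7, 1.9],  # separates the two groups
    "regB": [0.9, 0.1, 1.2, 0.3, 1.1, 0.2],  # does not separate them
}
ranking = rank_regulators(partition, candidates)
```

In this toy example `regA` ranks first because a single threshold on its expression reproduces the split exactly, while no threshold on `regB` does. A ranking of this kind is one way to picture how a conditional entropy score could be used to prioritize regulators, as the Results paragraph suggests.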
