Journal: IEEE Transactions on Knowledge and Data Engineering

A probabilistic model for mining labeled ordered trees: capturing patterns in carbohydrate sugar chains



Abstract

Glycans, or carbohydrate sugar chains, which play a number of important roles in the development and functioning of multicellular organisms, can be regarded as labeled ordered trees. A recent increase in the documentation of glycan structures, especially in the form of database curation, has made mining glycans important for the understanding of living cells. We propose a probabilistic model for mining labeled ordered trees, and we further present an efficient learning algorithm for this model, based on an EM algorithm. The time and space complexities of this algorithm are rather favorable, falling within the practical limits set by a variety of existing probabilistic models, including stochastic context-free grammars. Experimental results have shown that, in a supervised problem setting, the proposed method outperformed five other competing methods by a statistically significant factor in all cases. We further applied the proposed method to aligning multiple glycan trees, and we detected biologically significant common subtrees in these alignments where the trees are automatically classified into subtypes already known in glycobiology.
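
To make the phrase "labeled ordered trees" concrete, the following minimal Python sketch represents a glycan as a tree whose nodes carry monosaccharide labels and whose children are kept in a fixed left-to-right order. The `Node` class, the `preorder_labels` helper, and the simplified N-glycan core used as example data are illustrative assumptions, not taken from the paper. Linkage annotations are omitted, and the sketch does not implement the proposed probabilistic model or its EM-based learning algorithm.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One residue of a glycan (hypothetical representation).

    `label` is the monosaccharide name; the order of `children` is
    significant, which is what makes the tree "ordered".
    """
    label: str
    children: List["Node"] = field(default_factory=list)


def preorder_labels(root: Node) -> List[str]:
    """Return all labels in preorder, visiting children left to right."""
    labels = [root.label]
    for child in root.children:
        labels.extend(preorder_labels(child))
    return labels


# Simplified N-glycan core (two GlcNAc residues, then a branching Man),
# used here only as example data.
core = Node("GlcNAc", [
    Node("GlcNAc", [
        Node("Man", [Node("Man"), Node("Man")]),
    ]),
])

# Two trees with the same labels but a different sibling order.
left_extended = Node("Man", [Node("Man", [Node("GlcNAc")]), Node("Man")])
right_extended = Node("Man", [Node("Man"), Node("Man", [Node("GlcNAc")])])

print(preorder_labels(core))
print(left_extended == right_extended)
```

Because the dataclass-generated equality compares the `children` lists element by element, `left_extended` and `right_extended` count as different trees even though they contain the same residues. A model over ordered trees has to preserve exactly this distinction.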
