Published in: IEEE Symposium on Computers and Communications

Weighted Graph Constraint and Group Centric Non-negative Matrix Factorization for Gene-phenotype Association Prediction


Abstract

Gene-phenotype association prediction can help reveal the heritable basis of human diseases and support drug development. Gene-phenotype associations arise from complex biological processes and are influenced by various factors, such as the relationships among phenotypes and those among genes. However, because curated gene-phenotype associations are sparse, existing approaches are limited in prediction accuracy. In this paper, we propose a novel method, Weighted Graph Constraint and Group Centric Non-negative Matrix Factorization (GC²NMF), which builds on the advantages of Non-negative Matrix Factorization (NMF) by exploiting a weighted graph constraint learned from the hierarchical structure of phenotype data together with group prior information among genes. First, we use the depth of the parent-child relationship between two adjacent phenotypes in the hierarchical phenotype data as a weighted graph constraint, to better capture phenotype structure. Second, we use the intra-group correlation among genes in a gene group as a group constraint, to better capture gene structure. This information provides an intuitive prior: genes in the same group are likely to result in similar phenotypes. The model not only achieves high prediction performance but also jointly learns interpretable representations of genes and phenotypes that can support future biological analysis. Experimental results on mouse and human gene-phenotype association datasets demonstrate that GC²NMF achieves superior prediction accuracy over other state-of-the-art methods, along with good interpretability for biological explanation.
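The abstract does not state the GC²NMF objective or its update rules, so the following is only a minimal sketch of the general idea of constraining NMF with a weighted phenotype graph. It assumes the standard graph-regularized NMF formulation (minimize ||X - U Vᵀ||² + λ·Tr(Vᵀ L V), with L = D - W) and its multiplicative updates, which may differ from the paper's model. The function name, the toy association matrix X, the phenotype weights W, and the parameters k and lam are all hypothetical. The group constraint on genes is not implemented here.

```python
import numpy as np

def graph_regularized_nmf(X, W, k=2, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Generic graph-regularized NMF (an assumption, not the paper's GC2NMF).

    X : (n_genes, n_phenotypes) non-negative association matrix
    W : (n_phenotypes, n_phenotypes) symmetric non-negative edge weights
    Returns gene factors U, phenotype factors V, and the objective history.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_phen = X.shape
    U = rng.random((n_genes, k))
    V = rng.random((n_phen, k))
    D = np.diag(W.sum(axis=1))  # degree matrix of the phenotype graph
    L = D - W                   # graph Laplacian

    def objective():
        recon = np.linalg.norm(X - U @ V.T) ** 2
        smooth = np.trace(V.T @ L @ V)
        return recon + lam * smooth

    history = [objective()]  # objective at the random initialization
    for _ in range(n_iter):
        # Multiplicative updates keep U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        history.append(objective())
    return U, V, history

# Toy data (made up): 4 genes x 3 phenotypes, 1 = known association.
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)

# Hypothetical weights for parent-child edges in a phenotype hierarchy.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.5],
              [0.0, 0.5, 0.0]])

U, V, history = graph_regularized_nmf(X, W)
scores = U @ V.T  # predicted association scores for every gene-phenotype pair
```

In this sketch, phenotypes joined by a heavier edge are pulled toward similar latent vectors in V, which is one common way to encode a hierarchy-derived weighting. How the paper derives its weights from parent-child depth is not specified in the abstract.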
