Journal: Genome Biology

Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy


Abstract

The complete set of mouse genes, like the set of human genes, is still largely uncharacterized: experimental evidence about the activities and expression of the genes continues to accumulate, but the majority of genes are still of unknown function. Within the context of the MouseFunc competition, we developed and applied two distinct large-scale data mining approaches to infer the functions (Gene Ontology annotations) of mouse genes from experimental observations drawn from available functional genomics, proteomics, comparative genomics, and phenotypic data. The two strategies (the first using classifiers to map features to annotations, the second propagating annotations from characterized genes to uncharacterized genes along edges in a network constructed from the features) offer alternative and possibly complementary approaches to providing functional annotations. Here, we re-implement and evaluate these approaches and their combination for their ability to predict the proper functional annotations of genes in the MouseFunc data set. We show that, when controlling for the same set of input features, the network approach generally outperformed a naïve Bayesian classifier approach, while their combination offers some improvement over either independently. We report predictive performance on the MouseFunc competition hold-out set, as well as on a ten-fold cross-validation of the MouseFunc data. Across all 1,339 annotated genes in the MouseFunc test set, the median predictive power was quite strong (median area under a receiver operating characteristic plot of 0.865 and average precision of 0.195), indicating that a mining-based strategy with existing data is a promising path towards discovering mammalian gene functions.
As one product of this work, a high-confidence subset of the functional mouse gene network was produced — spanning >70% of mouse genes with >1.6 million associations — that is predictive of mouse (and therefore often human) gene function and functional associations. The network should be generally useful for mammalian gene functional analyses, such as for predicting interactions, inferring functional connections between genes and pathways, and prioritizing candidate genes. The network and all predictions are available on the worldwide web.
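To make the network strategy concrete, the following is a minimal sketch of propagating an annotation from characterized genes to their network neighbors and scoring the result by area under the ROC curve. It is an illustration only, not the authors' implementation: the gene names, edge weights, the simple "sum of weights to annotated neighbors" scoring rule, and the function names `propagate_scores` and `roc_auc` are all assumptions made for this example.

```python
from collections import defaultdict


def propagate_scores(edges, annotated):
    """Score each gene by the summed weight of its edges to annotated genes.

    edges: iterable of (gene_a, gene_b, weight) tuples (hypothetical toy data).
    annotated: set of genes already known to carry the annotation.
    """
    scores = defaultdict(float)
    for gene_a, gene_b, weight in edges:
        # Touch both genes so every gene in the network gets a score entry.
        scores[gene_a] += 0.0
        scores[gene_b] += 0.0
        if gene_b in annotated:
            scores[gene_a] += weight
        if gene_a in annotated:
            scores[gene_b] += weight
    return dict(scores)


def roc_auc(scores, positives):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for gene, s in scores.items() if gene in positives]
    neg = [s for gene, s in scores.items() if gene not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# Toy network: all names and weights are made up for illustration.
edges = [
    ("g1", "g2", 0.9),
    ("g2", "g3", 0.8),
    ("g3", "g4", 0.1),
    ("g4", "g5", 0.7),
]
annotated = {"g1", "g3"}   # genes with a known annotation
held_out_true = {"g2"}     # held-out gene that truly has the annotation

all_scores = propagate_scores(edges, annotated)
# Evaluate only on genes whose annotation was hidden from the propagation.
candidate_scores = {g: s for g, s in all_scores.items() if g not in annotated}
auc = roc_auc(candidate_scores, held_out_true)
print(candidate_scores, auc)
```

In this toy setup the held-out gene `g2` is linked to both annotated genes, so it outranks the other candidates. A real evaluation would repeat this per Gene Ontology term and summarize across terms, as the abstract does with median values.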
