Journal: BMC Bioinformatics

Missing value imputation for microRNA expression data by using a GO-based similarity measure

Abstract

Missing values are common in microarray data profiles. Rather than discarding genes or samples with incomplete expression levels, missing values should be properly imputed so that downstream analysis remains accurate. Imputation methods can be roughly categorized as expression level-based or domain knowledge-based. Methods of the first type rely only on expression data, without external data sources, while methods of the second type incorporate available domain knowledge into expression data to improve imputation accuracy. In recent years, microRNA (miRNA) microarrays have been widely developed and used to identify miRNA biomarkers in studies of complex human diseases. As with mRNA profiles, miRNA expression profiles with missing values can be treated with existing imputation methods. However, domain knowledge-based methods are hard to apply because miRNAs lack direct functional annotation. With the rapid accumulation of miRNA microarray data, there is a growing need for domain knowledge-based imputation algorithms specific to miRNA expression profiles, to improve the quality of miRNA data analysis.

We connect miRNAs with the domain knowledge of Gene Ontology (GO) via their target genes, and define miRNA functional similarity based on the semantic similarity of GO terms in GO graphs. A new measure combining miRNA functional similarity and expression similarity is used to impute missing values. The new measure is tested on two miRNA microarray datasets from breast cancer research and achieves improved performance compared with the expression-based method on both datasets.

The experimental results demonstrate that biological domain knowledge can benefit the estimation of missing values in miRNA profiles as well as in mRNA profiles. In particular, functional similarity defined by the GO terms annotated for the target genes of miRNAs can provide useful complementary information that helps the expression-based method improve the imputation accuracy of miRNA array data. Our method and data are available to the public upon request.
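
The abstract does not state how functional and expression similarity are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear blend controlled by a hypothetical weight `alpha`, an expression similarity derived from Euclidean distance, and a precomputed functional similarity matrix `func_sim` (for example, one derived from GO semantic similarity of target genes). All function and parameter names are illustrative.

```python
import math


def expression_similarity(x, y):
    """Similarity in (0, 1] from the RMS distance over samples observed in both rows."""
    shared = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not shared:
        return 0.0
    dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
    return 1.0 / (1.0 + dist)


def combined_similarity(expr_sim, func_sim, alpha=0.5):
    """Assumed linear blend; the paper's actual combination may differ."""
    return alpha * func_sim + (1.0 - alpha) * expr_sim


def impute(data, func_sim, k=2, alpha=0.5):
    """Fill each missing value (None) with a similarity-weighted mean of the
    k most similar rows (miRNAs) that have a value in the same column.

    data:     list of rows, one per miRNA, with None marking missing values
    func_sim: square matrix of functional similarities in [0, 1] (assumed given)
    """
    result = [row[:] for row in data]  # leave the input untouched
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(data):
                if m == i or other[j] is None:
                    continue
                sim = combined_similarity(
                    expression_similarity(row, other), func_sim[i][m], alpha
                )
                candidates.append((sim, other[j]))
            # Keep the k neighbours with the highest combined similarity.
            candidates.sort(key=lambda c: c[0], reverse=True)
            top = candidates[:k]
            total = sum(sim for sim, _ in top)
            if total > 0:
                result[i][j] = sum(sim * v for sim, v in top) / total
    return result
```

Setting `alpha=0` reduces the sketch to a purely expression-based weighted nearest-neighbour imputation, which gives a simple baseline for judging whether the functional term helps on a given dataset.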
