Advances in technologies and data collection processes have resulted in multiple high-dimensional data types being measured on the same subjects. In biomedical research, these data types include genomics, metabolomics, proteomics, and transcriptomics. While each of these data types provides a different snapshot of the underlying biological system, it is increasingly recognized that combining them can reveal complex relationships that individual analyses may not uncover. For instance, the integration of genomic and metabolomic/proteomic data can provide valuable insight into key genomic loci that influence plasma levels of metabolites or proteins associated with complex diseases [1]. This is of great interest because genomic studies, including genome-wide association studies (GWAS), have revealed that the majority of disease-causing single nucleotide polymorphisms (SNPs) lie in noncoding regions of the genome [2], making their functional implications difficult to determine. While individual genomic variants identified through GWAS can be tested experimentally, this approach is complicated by the modest effects of the identified variants and by the fact that the specific gene driving the genomic association may be unknown [3]. Integrating genomics data with other omics data can therefore help identify genomic variants that generate hypotheses about the genomic architecture of the underlying disease, or variants that have the potential to improve clinical factors. Since the metabolome is considered the end product of all genetic, epigenetic, and environmental activities [4, 5], linking metabolite levels in human blood samples with genomics data can help shed light on complex disease-causing genomic variants. Additionally, tying genomic variants to metabolite levels can identify metabolites that may serve as biomarkers or as potential targets for drug discovery [1].
A review of studies that combine genomics and metabolomics data can be found in [3]. In a recent study [1], genomics data were linked with levels of proteins known to be associated with cardiovascular disease (CVD), and many new gene locus-protein associations were identified, providing new insight into CVD risk pathophysiology.