Techniques are provided for analyzing gene expression data contained in real-world data and real-world evidence to assess biological pathways and identify molecular subtypes. Systems and methods include, for each of a plurality of biological pathways, determining a pathway score using gene expression data, and determining a summary score for the plurality of biological pathways. The summary score may be compared to one or more enrichment scores, each associated with a pre-determined molecular subtype, and a molecular subtype is determined based on that comparison. Various heuristics may be applied to filter pathways before summary scoring. Additionally, techniques are provided for diagnosing HER2 status for a patient by identifying a discordant result between the HER2 status from immunohistochemistry (IHC) and the HER2 status from fluorescence in-situ hybridization (FISH), and diagnosing HER2 status based on gene expression data.
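
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not specify how scores are computed, so everything concrete here is an assumption made for illustration. The pathway score is taken to be the mean expression of a pathway's measured genes, the filtering heuristic is a minimum count of measured genes, the summary score is the mean of the retained pathway scores, and the subtype is the one whose enrichment score lies closest to the summary score. The function names (`pathway_score`, `assign_subtype`, `resolve_her2`), the parameters `min_genes` and `threshold`, and the use of ERBB2 (the gene encoding HER2) expression to resolve discordant results are likewise hypothetical.

```python
from statistics import mean


def pathway_score(expression, genes):
    """Assumed scoring rule: mean expression of the pathway genes that were measured."""
    values = [expression[g] for g in genes if g in expression]
    return mean(values) if values else None


def assign_subtype(expression, pathways, subtype_enrichment, min_genes=2):
    """Score pathways, filter them, summarize, and pick the closest subtype."""
    scores = {}
    for name, genes in pathways.items():
        measured = [g for g in genes if g in expression]
        # Assumed heuristic filter: skip pathways with too few measured genes.
        if len(measured) >= min_genes:
            scores[name] = pathway_score(expression, measured)
    if not scores:
        raise ValueError("no pathway passed the filter")
    # Assumed summary rule: mean of the retained pathway scores.
    summary = mean(list(scores.values()))
    # Assumed comparison rule: smallest absolute distance to an enrichment score.
    subtype = min(
        subtype_enrichment,
        key=lambda s: abs(subtype_enrichment[s] - summary),
    )
    return subtype, summary, scores


def resolve_her2(ihc_status, fish_status, erbb2_expression, threshold):
    """Return the shared status if IHC and FISH agree; otherwise fall back
    to gene expression (assumed rule: ERBB2 expression against a threshold)."""
    if ihc_status == fish_status:
        return ihc_status
    return "positive" if erbb2_expression >= threshold else "negative"


# Toy values, invented purely for illustration (not real patient data).
expression = {"ERBB2": 8.0, "GRB7": 6.0, "ESR1": 2.0, "PGR": 1.0, "MKI67": 5.0}
pathways = {
    "her2_signaling": ["ERBB2", "GRB7"],
    "hormone": ["ESR1", "PGR"],
    "proliferation": ["MKI67"],  # dropped by the filter: only one measured gene
}
subtype_enrichment = {"subtype_A": 4.0, "subtype_B": 1.0}

subtype, summary, scores = assign_subtype(expression, pathways, subtype_enrichment)
her2 = resolve_her2("positive", "negative", expression["ERBB2"], threshold=7.0)
```

With these toy inputs, two pathways survive the filter and the discordant IHC/FISH pair is resolved from the expression value. In practice the scoring, filtering, and comparison rules would be replaced by whatever the full specification defines.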