This study used clinical and genetic data from the Autism Genome Project (AGP). Clinical data processing: the clinical data comprise reports of ASD diagnosis and neurodevelopmental assessment instruments. Agglomerative hierarchical clustering (AHC) was used to group clinically similar individuals into stable, validated clusters defined by multiple clinical measures. CNV data processing: rare, high-confidence CNVs previously identified by the AGP that target brain-expressed genes were retained for analysis. These CNV data were merged with the clinical data of the clustered ASD subjects to produce a final list of CNVs targeting brain-expressed genes. Functional annotation analysis: the biological processes associated with the brain-expressed genes targeted by CNVs were obtained using g:Profiler. Classifier design: a Naive Bayes machine-learning classifier was trained and tested on patient data to predict the phenotypic cluster of each patient from the biological processes disrupted by rare CNVs targeting brain-expressed genes.
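The two modelling steps described above (clustering individuals on clinical measures, then predicting cluster membership from disrupted biological processes) can be illustrated with a minimal sketch. It is not the study's implementation: the text does not state the linkage criterion, the distance metric, or the Naive Bayes variant, so average linkage, Euclidean distance, and a Bernoulli model with Laplace smoothing are assumptions made here. The toy clinical scores, the binary process matrix, and all function names are hypothetical, and the g:Profiler annotation step is replaced by a hand-written matrix.

```python
import math

def average_linkage_clusters(points, n_clusters):
    """Agglomerative clustering (assumed: average linkage, Euclidean distance)."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per individual

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(math.dist(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary features (assumed variant)."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior = math.log(len(rows) / len(X))
        # Laplace-smoothed P(process disrupted | cluster c), one value per process.
        probs = [(sum(r[f] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for f in range(len(X[0]))]
        model[c] = (log_prior, probs)
    return model

def predict_bernoulli_nb(model, x):
    """Return the cluster label with the highest posterior log-score."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(math.log(p if v else 1.0 - p)
                               for v, p in zip(x, probs))
    return max(model, key=score)

# Hypothetical clinical measures (two scores per individual), not AGP data.
clinical_scores = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0),
                   (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
cluster_labels = average_linkage_clusters(clinical_scores, n_clusters=2)

# Hypothetical binary matrix: 1 if a rare CNV disrupts a gene in that
# biological process, 0 otherwise (one row per individual, three processes).
process_matrix = [[1, 0, 1], [1, 0, 0], [1, 0, 1],
                  [0, 1, 0], [0, 1, 1], [0, 1, 0]]

nb_model = train_bernoulli_nb(process_matrix, cluster_labels)
predicted_a = predict_bernoulli_nb(nb_model, [1, 0, 0])
predicted_b = predict_bernoulli_nb(nb_model, [0, 1, 0])
print(cluster_labels, predicted_a, predicted_b)
```

In a real analysis the classifier would be evaluated on individuals held out from training, and cluster stability would be validated before the labels are used as prediction targets; the sketch skips both for brevity.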