Systems, methods, and computer-readable storage media with executable instructions for detecting outliers and hidden relationships in heterogeneous data sets are provided. Features of the invention pertain to the design and operation of various predictive models that identify multivariate outliers and influential observations by recognizing systematic local relationships within heterogeneous data sets or subpopulations of heterogeneous data sets. Multivariate outliers and influential observations are identified using general distance metrics that are specific to, and defined for, any number of individual observations within the heterogeneous data sets. Aspects of the invention may be applied to sets of data that are large and complex (e.g., loan portfolios, health insurance company data, homeland security profiles) or to sets of data having a more limited scope (e.g., medical or drug research).
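As an illustration only, the idea of scoring each observation against its own local neighborhood, rather than against one global threshold, can be sketched as follows. This is not the claimed method: it swaps in a simplified local-distance ratio (similar in spirit to the local outlier factor), and the function names, the sample data, the neighborhood size `k`, and the cutoff of 2.0 are all assumptions made for the example.

```python
import math

def knn_mean_distance(points, i, k):
    """Mean Euclidean distance from points[i] to its k nearest neighbours,
    together with the indices of those neighbours."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    nearest = dists[:k]
    mean_d = sum(d for d, _ in nearest) / k
    return mean_d, [j for _, j in nearest]

def local_outlier_scores(points, k=3):
    """Ratio of each point's mean kNN distance to the average mean kNN
    distance of its neighbours. Values near 1 mean the point is about as
    isolated as its neighbours; large values mean it is locally unusual."""
    info = [knn_mean_distance(points, i, k) for i in range(len(points))]
    scores = []
    for mean_d, neighbours in info:
        neighbour_mean = sum(info[j][0] for j in neighbours) / k
        scores.append(mean_d / neighbour_mean if neighbour_mean > 0 else float("inf"))
    return scores

# Hypothetical heterogeneous data: two subpopulations with different
# densities, plus one stray observation near the dense one.
data = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),          # dense subpopulation
    (10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0),  # sparse subpopulation
    (1.0, 1.0),                                              # stray observation
]

scores = local_outlier_scores(data, k=3)
flagged = [i for i, s in enumerate(scores) if s > 2.0]  # 2.0 is an arbitrary cutoff
print(flagged)
```

In this toy data the stray observation lies closer to its neighbors than the members of the sparse subpopulation lie to one another, so a single global distance threshold could not flag it without also flagging the whole sparse group. Measuring distance relative to each observation's own neighborhood separates the two cases.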