To identify outlier risks, a risk assessment is received from a first computer. The risk assessment comprises a plurality of risks, and each risk comprises a plurality of words and a plurality of attributes. A risk category associated with the risk assessment is received from a second computer. The risk category is based on the plurality of words and the plurality of attributes and is a selected one of a high risk category and a not-high risk category. A word count is calculated for each word in each risk category. A probability score is also calculated for each word, generating a plurality of probability scores associated with each risk, and a risk score is calculated for each risk based on the plurality of probability scores associated with that risk. A distribution is generated that identifies the high risk category and the not-high risk category and identifies each risk score within its associated risk category. It is then determined whether the risk associated with a given risk score is an outlier for the associated risk category.
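The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed method: the abstract does not specify how the probability score, the risk score, or the outlier test are computed. Here the probability score is assumed to be the Laplace-smoothed share of a word's occurrences that fall in the high risk category, the risk score is assumed to be the mean of its words' probability scores, and an outlier is assumed to be a risk whose score lies more than `z_threshold` population standard deviations from the mean of its category. The sketch uses only the words of each risk and ignores the attributes. All function and variable names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

HIGH, NOT_HIGH = "high", "not-high"


def word_counts(risks):
    """Count how often each word occurs in each risk category.

    `risks` is a list of (words, category) pairs.
    """
    counts = {HIGH: Counter(), NOT_HIGH: Counter()}
    for words, category in risks:
        counts[category].update(words)
    return counts


def word_probability(word, counts):
    """Assumed score: smoothed fraction of the word's occurrences in HIGH."""
    high = counts[HIGH][word]
    not_high = counts[NOT_HIGH][word]
    # Add-one (Laplace) smoothing keeps unseen words at a neutral 0.5.
    return (high + 1) / (high + not_high + 2)


def risk_score(words, counts):
    """Assumed score: mean probability score over the risk's words."""
    if not words:
        return 0.5  # no evidence either way
    return mean(word_probability(w, counts) for w in words)


def find_outliers(risks, z_threshold=2.0):
    """Return (scores, outlier_indices) using a per-category z-score test."""
    counts = word_counts(risks)
    scores = [risk_score(words, counts) for words, _ in risks]
    outliers = []
    for category in (HIGH, NOT_HIGH):
        # The "distribution" here is simply the list of scores per category.
        indices = [i for i, (_, c) in enumerate(risks) if c == category]
        category_scores = [scores[i] for i in indices]
        if len(category_scores) < 2:
            continue
        mu = mean(category_scores)
        sigma = pstdev(category_scores)
        if sigma == 0:
            continue  # all scores identical, so nothing stands out
        for i in indices:
            if abs(scores[i] - mu) / sigma > z_threshold:
                outliers.append(i)
    return scores, outliers


if __name__ == "__main__":
    sample = (
        [(["fire", "explosion"], HIGH)] * 5
        + [(["paper", "stapler"], NOT_HIGH)] * 9
        + [(["fire", "explosion"], NOT_HIGH)]  # worded like a high risk
    )
    scores, outliers = find_outliers(sample)
    print("outlier indices:", outliers)
```

In this sketch, a risk categorized as not-high but described with words that mostly occur in high risks receives a score far from the rest of its category, which is the kind of mismatch the outlier determination is meant to surface. Other scoring rules or outlier tests (for example, an interquartile-range rule) could be substituted without changing the overall flow.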