The RAND Database of Worldwide Terrorism Incidents (RDWTI) seeks to index information about all terrorist incidents that occur and are mentioned in worldwide news media, providing a useful resource for policy researchers and decision makers. We examined automated classification methods that could be used to identify news articles about terrorist incidents, thus enabling analysts to read a smaller number of news articles and maintain the database with less effort and cost. The support vector machine (SVM) and Lasso methods were only modestly successful, but a classifier based on the gradient boosting method (GBM) appeared to be very successful, correctly ranking 80% of the relevant articles at the “top of the pile” for examination by a human analyst.
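The "top of the pile" result can be read as a recall figure over a ranked list: articles are sorted by classifier score, and the question is what share of the truly relevant ones fall within the portion an analyst actually reads. The sketch below illustrates that calculation only. The function name `recall_at_k`, the cutoff `k`, and the toy scores and labels are all hypothetical; the abstract does not state how the cutoff was defined or give any of the underlying data.

```python
def recall_at_k(scores, labels, k):
    """Share of relevant articles (label 1) found among the k highest-scored.

    Hypothetical helper for illustration; not the study's evaluation code.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and the number of articles")
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("labels contain no relevant articles")
    # Sort article indices from highest to lowest classifier score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Count the relevant articles that land in the top k positions.
    found = sum(labels[i] for i in ranked[:k])
    return found / total_relevant


# Made-up scores and relevance labels for ten articles (assumption, not study data).
scores = [0.95, 0.10, 0.80, 0.30, 0.70, 0.05, 0.60, 0.20, 0.15, 0.40]
labels = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]

print(recall_at_k(scores, labels, 3))
```

In this toy data, three of the four relevant articles are ranked in the top three, so an analyst reading only those three would see 75% of the relevant material.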