The vector space model is commonly used for the formal representation of text, but it does not highlight the features that play a key role in a text's content. An improved feature selection method based on keywords is proposed, which uses the structural information of the text together with mutual information theory to extract keywords from the text content. Tests with a support vector machine (SVM) classifier show that classification accuracy improves significantly.
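The abstract does not give the scoring formula, so the following is only a minimal sketch of the mutual information step, assuming the standard definition of mutual information between the binary presence of a term and the class label. The function names `mutual_information` and `select_keywords`, the toy documents, and the labels are all hypothetical. The structural weighting and the SVM classification stage are not modeled here.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term):
    """Mutual information (in bits) between presence of `term` and the class label.

    `docs` is a list of token sets and `labels` a parallel list of class names.
    Assumption: the standard MI definition; the paper's exact formula is not given.
    """
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    term_counts = Counter(term in doc for doc in docs)
    label_counts = Counter(labels)
    mi = 0.0
    for (present, label), count in joint.items():
        p_joint = count / n
        p_term = term_counts[present] / n
        p_label = label_counts[label] / n
        mi += p_joint * math.log2(p_joint / (p_term * p_label))
    return mi


def select_keywords(docs, labels, k):
    """Return the k terms with the highest MI score (ties broken alphabetically)."""
    vocabulary = sorted({t for doc in docs for t in doc})
    scored = [(mutual_information(docs, labels, t), t) for t in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in scored[:k]]


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"goal", "match", "team"},
    {"goal", "score", "team"},
    {"election", "vote", "team"},
    {"election", "party", "vote"},
]
labels = ["sports", "sports", "politics", "politics"]

keywords = select_keywords(docs, labels, 3)
print(keywords)
```

In this toy corpus, terms that occur in exactly one class (such as "goal") score higher than a term spread across both classes (such as "team"). The selected terms could then serve as the reduced feature set for a classifier such as an SVM.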