The present invention relates to a document classification system and a method thereof, and more specifically to a method that classifies documents accurately by extracting and abstracting feature information in consideration of the relations between an anchor text and its surrounding words. The purpose of the present invention is to provide a new feature extraction method that extends existing word feature extraction methods by taking the relations between words into account, in order to improve the classification performance for hypertext documents.
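As an illustration only, the idea of extracting features from an anchor text together with its surrounding words might be sketched as follows. The abstract does not disclose the actual procedure, so everything here is an assumption: the fixed context window (`WINDOW`), the feature naming (`anchor=...`, `pair=...`), and the helper names `AnchorContextParser` and `extract_features` are hypothetical, and the feature abstraction step is not covered.

```python
from collections import Counter
from html.parser import HTMLParser
import re

WINDOW = 3  # hypothetical: number of context words kept on each side of an anchor


class AnchorContextParser(HTMLParser):
    """Collects page words in order and records which word ranges sit inside <a> tags."""

    def __init__(self):
        super().__init__()
        self.tokens = []        # all words in document order
        self.anchor_spans = []  # (start, end) word index ranges of anchor texts
        self._start = None      # index where the currently open anchor began

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._start = len(self.tokens)

    def handle_endtag(self, tag):
        if tag == "a" and self._start is not None:
            self.anchor_spans.append((self._start, len(self.tokens)))
            self._start = None

    def handle_data(self, data):
        # Lowercase and keep alphanumeric runs as words.
        self.tokens.extend(re.findall(r"[a-z0-9]+", data.lower()))


def extract_features(html, window=WINDOW):
    """Count anchor words and (anchor word, surrounding word) pairs."""
    parser = AnchorContextParser()
    parser.feed(html)
    parser.close()
    features = Counter()
    for start, end in parser.anchor_spans:
        anchor_words = parser.tokens[start:end]
        context = (parser.tokens[max(0, start - window):start]
                   + parser.tokens[end:end + window])
        for a in anchor_words:
            features["anchor=" + a] += 1
            for c in context:
                # A pair feature ties an anchor word to a nearby word.
                features["pair=" + a + "|" + c] += 1
    return features


if __name__ == "__main__":
    page = ('<p>Read the latest <a href="x.html">football scores</a> '
            'from the league today</p>')
    for name, count in sorted(extract_features(page, window=2).items()):
        print(name, count)
```

The resulting counts could be passed to any standard text classifier. Pairing anchor words with their neighbours is one simple way to encode a relation between words rather than treating each word independently; the method claimed by the invention may differ.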