Given the rapidly growing volume of information on the Web, document quality is a critical issue in Web information retrieval. This paper examines the importance of surface linguistic features in predicting the quality of user-generated documents. A machine learning approach that incorporates surface linguistic features into document quality prediction is tested on a collection of answers gathered from a community-driven knowledge search service, where users ask questions and answer questions posed by other users. Experimental results show that the features are useful for predicting the quality of answers.